Method of diagnosing breast cancer

ABSTRACT

Objective methods for detecting and diagnosing breast cancer (BRC) are described herein. In one embodiment, the diagnostic method involves determining the expression level of a BRC-associated gene that discriminates between BRC cells and normal cells. In another embodiment, the diagnostic method involves determining the expression level of a BRC-associated gene that discriminates among BRC cells, for example between DCIS and IDC cells. The present invention further provides means for predicting and preventing breast cancer metastasis using BRC-associated genes having unique altered expression patterns in breast cancer cells with lymph-node metastasis. Finally, the present invention provides methods of screening for therapeutic agents useful in the treatment of breast cancer, methods of treating breast cancer, and methods for vaccinating a subject against breast cancer.

This application is a division of U.S. application Ser. No. 12/416,900, filed Apr. 1, 2009, which is a divisional of U.S. application Ser. No. 10/573,297, filed Apr. 9, 2007, now U.S. Pat. No. 7,531,300, which is the U.S. National Phase entry under 35 U.S.C. §371 of International Application No. PCT/JP2004/014438, filed Sep. 24, 2004, which claims the benefit of U.S. Provisional Application Ser. No. 60/505,571, filed Sep. 24, 2003, the contents of each of which are hereby incorporated by reference in their entirety.

FIELD OF THE INVENTION

The present invention relates to methods of detecting and diagnosing breast cancer as well as methods of treating and preventing breast cancer and breast cancer metastasis.

BACKGROUND OF THE INVENTION

Breast cancer, a genetically heterogeneous disease, is the most common malignancy in women. An estimated 800,000 new cases are reported each year worldwide (Parkin D M, Pisani P, Ferlay J (1999). CA Cancer J Clin 49: 33-64). Mastectomy is currently the first option for the treatment of this disease. Despite surgical removal of the primary tumors, relapse at local or distant sites may occur due to micrometastasis that is undetectable at the time of diagnosis (Saphner T, Tommey D C, Gray R (1996). J Clin Oncol, 14, 2738-2749). Cytotoxic agents are usually administered as adjuvant therapy after surgery, with the aim of killing those residual or premalignant cells.

Treatment with conventional chemotherapeutic agents is often empirical, is mostly based on histological tumor parameters, and proceeds in the absence of specific mechanistic understanding. Target-directed drugs are therefore becoming the bedrock treatment for breast cancer. Tamoxifen and aromatase inhibitors, two representatives of this kind, have been shown to elicit strong responses when used as adjuvant therapy or chemoprevention in patients with metastasized breast cancer (Fisher B, Costantino J P, Wickerham D L, Redmond C K, Kavanah M, Cronin W M, Vogel V, Robidoux A, Dimitrov N, Atkins J, Daly M, Wieand S, Tan-Chiu E, Ford L, Wolmark N (1998). J Natl Cancer Inst, 90, 1371-1388; Cuzick J (2002). Lancet 360, 817-824). However, the drawback is that only patients whose tumors express estrogen receptors are sensitive to these drugs. Concerns have also recently been raised regarding their side effects, particularly the possibility of endometrial cancer with long-term tamoxifen treatment and the deleterious effect of bone fracture in postmenopausal women prescribed aromatase inhibitors (Coleman R E (2004). Oncology. 18 (5 Suppl 3), 16-20). Owing to the emergence of side effects and drug resistance, it is necessary to search for novel molecular targets for selective smart drugs on the basis of characterized mechanisms of action.

Breast cancer is a complex disease associated with numerous genetic changes. Little is known about whether these abnormalities are the cause of breast tumorigenesis, although it has been reported that they occur by a multistep process which can be broadly equated to transformation of normal cells, via the steps of atypical ductal hyperplasia, ductal carcinoma in situ (DCIS) and invasive ductal carcinoma (IDC). There is evidence that only a portion of premalignant lesions are committed to progression to invasive cancer while the other lesions undergo spontaneous regression. Elucidating the molecular events that lead to the development of primary breast cancer, its progression, and its formation of metastases is the main focus for new strategies targeted at prevention and treatment.

Gene-expression profiles generated by cDNA microarray analysis can provide considerably more detail about the nature of individual cancers than traditional histopathological methods are able to supply. The promise of such information lies in its potential for improving clinical strategies for treating neoplastic diseases and developing novel drugs (Petricoin, E. F., 3rd, Hackett, J. L., Lesko, L. J., Puri, R. K., Gutman, S. I., Chumakov, K., Woodcock, J., Feigal, D. W., Jr., Zoon, K. C., and Sistare, F. D. Medical applications of microarray technologies: a regulatory science perspective. Nat Genet, 32 Suppl: 474-479, 2002). To this end, the present inventors have analyzed the expression profiles of tumors from various tissues by cDNA microarrays (Okabe, H. et al., Genome-wide analysis of gene expression in human hepatocellular carcinomas using cDNA microarray: identification of genes involved in viral carcinogenesis and tumor progression. Cancer Res, 61: 2129-2137, 2001; Hasegawa, S. et al., Genome-wide analysis of gene expression in intestinal-type gastric cancers using a complementary DNA microarray representing 23,040 genes. Cancer Res, 62: 7012-7017, 2002; Kaneta, Y. et al., Prediction of Sensitivity to STI571 among Chronic Myeloid Leukemia Patients by Genome-wide cDNA Microarray Analysis. Jpn J Cancer Res, 93: 849-856, 2002; Kaneta, Y. et al., Genome-wide analysis of gene-expression profiles in chronic myeloid leukemia cells using a cDNA microarray. Int J Oncol, 23: 681-691, 2003; Kitahara, O. et al., Alterations of gene expression during colorectal carcinogenesis revealed by cDNA microarrays after laser-capture microdissection of tumor tissues and normal epithelia. Cancer Res, 61: 3544-3549, 2001; Lin, Y. et al., Molecular diagnosis of colorectal tumors by expression profiles of 50 genes expressed differentially in adenomas and carcinomas. Oncogene, 21: 4120-4128, 2002; Nagayama, S. et al., Genome-wide analysis of gene expression in synovial sarcomas using a cDNA microarray. Cancer Res, 62: 5859-5866, 2002; Okutsu, J. et al., Prediction of chemosensitivity for patients with acute myeloid leukemia, according to expression levels of 28 genes selected by genome-wide complementary DNA microarray analysis. Mol Cancer Ther, 1: 1035-1042, 2002; Kikuchi, T. et al., Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs. Oncogene, 22: 2192-2205, 2003).

Recent examination of the expression levels of thousands of genes through the use of cDNA microarrays has resulted in the discovery of distinct patterns in different types of breast cancer (Sgroi, D. C. et al., In vivo gene expression profile analysis of human breast cancer progression. Cancer Res, 59: 5656-5661, 1999; Sorlie, T. et al., Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A, 98: 10869-10874, 2001; Kauraniemi, P. et al., New amplified and highly expressed genes discovered in the ERBB2 amplicon in breast cancer by cDNA microarrays. Cancer Res, 61: 8235-8240, 2001; Gruvberger, S. et al., Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res, 61: 5979-5984, 2001; Dressman, M. et al., Gene expression profiling detects gene amplification and differentiates tumor types in breast cancer. Cancer Res, 63: 2194-2199, 2003).

Studies into gene-expression profiles in breast cancers have resulted in the identification of genes that may serve as candidates for diagnostic markers or prognosis profiles. However, these data, derived primarily from tumor masses, cannot adequately reflect expressional changes during breast carcinogenesis, because breast cancer cells exist as a solid mass with a highly inflammatory reaction and containing various cellular components. Therefore, previously published microarray data are likely to reflect heterogeneous profiles.

Studies designed to reveal mechanisms of carcinogenesis have already facilitated the identification of molecular targets for certain anti-tumor agents. For example, inhibitors of farnesyltransferase (FTIs), which were originally developed to inhibit the growth-signaling pathway related to Ras, whose activation depends on post-translational farnesylation, have been shown to be effective in treating Ras-dependent tumors in animal models (He et al., Cell 99:335-45 (1999)). Similarly, clinical trials on humans using a combination of anti-cancer drugs and the anti-HER2 monoclonal antibody, trastuzumab, with the aim of antagonizing the proto-oncogene receptor HER2/neu have achieved improved clinical response and overall survival of breast-cancer patients (Lin et al., Cancer Res 61:6345-9 (2001)). Finally, a tyrosine kinase inhibitor, STI-571, which selectively inactivates bcr-abl fusion proteins, has been developed to treat chronic myelogenous leukemias wherein constitutive activation of bcr-abl tyrosine kinase plays a crucial role in the transformation of leukocytes. Agents of these kinds are designed to suppress oncogenic activity of specific gene products (Fujita et al., Cancer Res 61:7722-6 (2001)). Accordingly, it is apparent that gene products commonly up-regulated in cancerous cells may serve as potential targets for developing novel anti-cancer agents.

It has been further demonstrated that CD8+ cytotoxic T lymphocytes (CTLs) recognize epitope peptides derived from tumor-associated antigens (TAAs) presented on the MHC class I molecule, and lyse tumor cells. Since the discovery of the MAGE family as the first example of TAAs, many other TAAs have been discovered using immunological approaches (Boon, Int J Cancer 54: 177-80 (1993); Boon and van der Bruggen, J Exp Med 183: 725-9 (1996); van der Bruggen et al., Science 254: 1643-7 (1991); Brichard et al., J Exp Med 178: 489-95 (1993); Kawakami et al., J Exp Med 180: 347-52 (1994)). Some of the newly discovered TAAs are currently undergoing clinical development as targets of immunotherapy. TAAs discovered so far include MAGE (van der Bruggen et al., Science 254: 1643-7 (1991)), gp100 (Kawakami et al., J Exp Med 180: 347-52 (1994)), SART (Shichijo et al., J Exp Med 187: 277-88 (1998)), and NY-ESO-1 (Chen et al., Proc Natl Acad Sci USA 94: 1914-8 (1997)). On the other hand, gene products demonstrated to be specifically over-expressed in tumor cells have been shown to be recognized as targets inducing cellular immune responses. Such gene products include p53 (Umano et al., Brit J Cancer 84: 1052-7 (2001)), HER2/neu (Tanaka et al., Brit J Cancer 84: 94-9 (2001)), CEA (Nukaya et al., Int J Cancer 80: 92-7 (1999)), and so on.

In spite of significant progress in basic and clinical research concerning TAAs (Rosenberg et al., Nature Med 4: 321-7 (1998); Mukherji et al., Proc Natl Acad Sci USA 92: 8078-82 (1995); Hu et al., Cancer Res 56: 2479-83 (1996)), only a limited number of candidate TAAs for the treatment of adenocarcinomas, including colorectal cancer, are currently available. TAAs abundantly expressed in cancer cells yet whose expression is restricted to cancer cells would be promising candidates as immunotherapeutic targets. Further, identification of new TAAs inducing potent and specific antitumor immune responses is expected to encourage clinical use of peptide vaccination strategies for various types of cancer (Boon and van der Bruggen, J Exp Med 183: 725-9 (1996); van der Bruggen et al., Science 254: 1643-7 (1991); Brichard et al., J Exp Med 178: 489-95 (1993); Kawakami et al., J Exp Med 180: 347-52 (1994); Shichijo et al., J Exp Med 187: 277-88 (1998); Chen et al., Proc Natl Acad Sci USA 94: 1914-8 (1997); Harris, J Natl Cancer Inst 88: 1442-5 (1996); Butterfield et al., Cancer Res 59: 3134-42 (1999); Vissers et al., Cancer Res 59: 5554-9 (1999); van der Burg et al., J Immunol 156: 3308-14 (1996); Tanaka et al., Cancer Res 57: 4465-8 (1997); Fujie et al., Int J Cancer 80: 169-72 (1999); Kikuchi et al., Int J Cancer 81: 459-66 (1999); Oiso et al., Int J Cancer 81: 387-94 (1999)).

It has been repeatedly reported that peptide-stimulated peripheral blood mononuclear cells (PBMCs) from certain healthy donors produce significant levels of IFN-γ in response to the peptide, but rarely exert cytotoxicity against tumor cells in an HLA-A24 or -A0201 restricted manner in ⁵¹Cr-release assays (Kawano et al., Cancer Res 60: 3550-8 (2000); Nishizaka et al., Cancer Res 60: 4830-7 (2000); Tamura et al., Jpn J Cancer Res 92: 762-7 (2001)). However, both HLA-A24 and HLA-A0201 are common HLA alleles in the Japanese as well as the Caucasian populations (Date et al., Tissue Antigens 47: 93-101 (1996); Kondo et al., J Immunol 155: 4307-12 (1995); Kubo et al., J Immunol 152: 3913-24 (1994); Imanishi et al., Proceeding of the eleventh International Histocompatibility Workshop and Conference Oxford University Press, Oxford, 1065 (1992); Williams et al., Tissue Antigen 49: 129 (1997)). Thus, antigenic peptides of carcinomas presented by these HLAs may be especially useful for the treatment of carcinomas among Japanese and Caucasians. Further, it is known that the induction of low-affinity CTL in vitro usually results from the use of peptide at a high concentration, generating a high level of specific peptide/MHC complexes on antigen presenting cells (APCs), which will effectively activate these CTL (Alexander-Miller et al., Proc Natl Acad Sci USA 93: 4102-7 (1996)).

Accordingly, in an effort to understand the carcinogenic mechanisms associated with cancer and identify potential targets for developing novel anti-cancer agents, the present inventors performed large scale genome-wide analyses of gene expression profiles found in purified populations of breast cancer cells, including 12 ductal carcinomas in situ (DCIS) and 69 invasive ductal carcinomas (IDC), using a cDNA microarray representing 23,040 genes.

SUMMARY OF THE INVENTION

The present invention is based on the discovery of a pattern of gene expression that correlates with breast cancer (BRC). Genes that are differentially expressed in breast cancer are collectively referred to herein as “BRC nucleic acids” or “BRC polynucleotides” and the corresponding encoded polypeptides are referred to as “BRC polypeptides” or “BRC proteins.”

Accordingly, the present invention provides a method of diagnosing or determining a predisposition to breast cancer in a subject by determining an expression level of a BRC-associated gene in a patient-derived biological sample, such as a tissue sample. The term “BRC-associated gene” refers to a gene that is characterized by an expression level which differs in a BRC cell as compared to a normal cell. A normal cell is one obtained from breast tissue. In the context of the present invention, a BRC-associated gene is a gene listed in Tables 3-8 (i.e., genes of BRC Nos. 123-512). An alteration, e.g., an increase or decrease in the level of expression of a gene as compared to a normal control level of the gene, indicates that the subject suffers from or is at risk of developing BRC.

In the context of the present invention, the phrase “control level” refers to a protein expression level detected in a control sample and includes both a normal control level and a breast cancer control level. A control level can be a single expression pattern derived from a single reference population or from a plurality of expression patterns. For example, the control level can be a database of expression patterns from previously tested cells. A “normal control level” refers to a level of gene expression detected in a normal, healthy individual or in a population of individuals known not to be suffering from breast cancer. A normal individual is one with no clinical symptoms of breast cancer. On the other hand, a “BRC control level” refers to an expression profile of BRC-associated genes found in a population suffering from BRC.

An increase in the expression level of one or more BRC-associated genes listed in Tables 3, 5, and 7 (i.e., genes of BRC Nos. 123-175, 374-398, and 448-471) detected in a test sample as compared to a normal control level indicates that the subject (from which the sample was obtained) suffers from or is at risk of developing BRC. In contrast, a decrease in the expression level of one or more BRC-associated genes listed in Tables 4, 6, and 8 (i.e., genes of BRC Nos. 176-373, 399-447, and 472-512) detected in a test sample compared to a normal control level indicates said subject suffers from or is at risk of developing BRC.

Alternatively, expression of a panel of BRC-associated genes in a sample can be compared to a BRC control level of the same panel of genes. A similarity between a sample expression and BRC control expression indicates that the subject (from which the sample was obtained) suffers from or is at risk of developing BRC.

According to the present invention, gene expression level is deemed “altered” when gene expression is increased or decreased by 10%, 25%, or 50% as compared to the control level.

Alternatively, an expression level is deemed “increased” or “decreased” when gene expression is increased or decreased by at least 0.1, at least 0.2, at least 1, at least 2, at least 5, or at least 10 or more fold as compared to a control level. Expression is determined by detecting hybridization, e.g., on an array, of a BRC-associated gene probe to a gene transcript of the patient-derived tissue sample.
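The percentage criterion described above can be illustrated with a minimal sketch. The function names, the default 50% cutoff, and the numeric inputs are hypothetical choices made for illustration only; the specification names 10%, 25%, and 50% as possible cutoffs and does not prescribe any particular implementation.

```python
def percent_change(sample_level: float, control_level: float) -> float:
    """Signed percentage difference of a sample level relative to a control level."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return (sample_level - control_level) / control_level * 100.0


def call_expression(sample_level: float, control_level: float,
                    min_percent: float = 50.0) -> str:
    """Label a gene as 'increased', 'decreased', or 'unchanged'.

    min_percent is an assumed cutoff; the text lists 10%, 25%, and 50%
    as alternative thresholds.
    """
    change = percent_change(sample_level, control_level)
    if change >= min_percent:
        return "increased"
    if change <= -min_percent:
        return "decreased"
    return "unchanged"
```

Under these assumptions, a gene from Tables 3, 5, or 7 that is called "increased", or a gene from Tables 4, 6, or 8 that is called "decreased", would count toward the indication described in the text.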

In the context of the present invention, the patient-derived tissue sample is any tissue obtained from a test subject, e.g., a patient known to or suspected of having BRC. For example, the tissue may contain an epithelial cell. More particularly, the tissue may be an epithelial cell from a breast ductal carcinoma.

The present invention also provides a BRC reference expression profile, comprising a gene expression level of two or more of the BRC-associated genes listed in Tables 3-8. Alternatively, the BRC reference expression profile may comprise the levels of expression of two or more of the BRC-associated genes listed in Tables 3, 5, and 7, or the BRC-associated genes listed in Tables 4, 6, and 8.

The present invention further provides methods of identifying an agent that inhibits or enhances the expression or activity of a BRC-associated gene, e.g., a BRC-associated gene listed in Tables 3-8, by contacting a test cell expressing a BRC-associated gene with a test compound and determining the expression level of the BRC-associated gene or the activity of its gene product. The test cell may be an epithelial cell, such as an epithelial cell obtained from a breast carcinoma. A decrease in the expression level of an up-regulated BRC-associated gene or the activity of its gene product as compared to a normal control level or activity of the gene or gene product indicates that the test agent is an inhibitor of the BRC-associated gene and may be used to reduce a symptom of BRC, e.g., the expression of one or more BRC-associated genes listed in Tables 3, 5, and 7. Alternatively, an increase in the expression level of a down-regulated BRC-associated gene or the activity of its gene product as compared to a normal control level or activity of the gene or gene product indicates that the test agent is an enhancer of expression or function of the BRC-associated gene and may be used to reduce a symptom of BRC, e.g., the under-expression of one or more BRC-associated genes listed in Tables 4, 6, and 8.

The present invention also provides a kit comprising a detection reagent which binds to one or more BRC nucleic acids or BRC polypeptides. Also provided is an array of nucleic acids that binds to one or more BRC nucleic acids.

Therapeutic methods of the present invention include a method of treating or preventing BRC in a subject including the step of administering to the subject an antisense composition. In the context of the present invention, the antisense composition reduces the expression of the specific target gene. For example, the antisense composition may contain a nucleotide which is complementary to a BRC-associated gene sequence selected from the group consisting of the BRC-associated genes listed in Tables 3, 5, and 7. Alternatively, the present method may include the step of administering to a subject a small interfering RNA (siRNA) composition. In the context of the present invention, the siRNA composition reduces the expression of a BRC nucleic acid selected from the group consisting of the BRC-associated genes listed in Tables 3, 5, and 7. In yet another method, the treatment or prevention of BRC in a subject may be carried out by administering to a subject a ribozyme composition. In the context of the present invention, the nucleic acid-specific ribozyme composition reduces the expression of a BRC nucleic acid selected from the group consisting of the BRC-associated genes listed in Tables 3, 5, and 7. Indeed, the inhibitory effect of siRNA against BRC-associated genes listed in the Tables was confirmed. For example, the examples section clearly shows that the siRNA for BRC-456 of Table 7 (GenBank Accession Nos. AF237709 and NM_018492, TOPK; T-LAK cell-originated protein kinase; SEQ ID NOS:48-51) inhibits cell proliferation of breast cancer cells. Thus, in the present invention, BRC-associated genes listed in Tables 3, 5, and 7, especially BRC-456, are preferable therapeutic targets of breast cancer.

Other therapeutic methods include those in which a subject is administered a compound that increases the expression of one or more of the BRC-associated genes listed in Tables 4, 6, and 8 or the activity of a polypeptide encoded by one or more of the BRC-associated genes listed in Tables 4, 6, and 8.

The present invention also includes vaccines and vaccination methods. For example, a method of treating or preventing BRC in a subject may involve administering to the subject a vaccine containing a polypeptide encoded by a nucleic acid selected from the group consisting of BRC-associated genes listed in Tables 3, 5, and 7 or an immunologically active fragment of such a polypeptide. In the context of the present invention, an immunologically active fragment is a polypeptide that is shorter in length than the full-length naturally-occurring protein yet which induces an immune response analogous to that induced by the full-length protein. For example, an immunologically active fragment should be at least 8 residues in length and capable of stimulating an immune cell such as a T cell or a B cell. Immune cell stimulation can be measured by detecting cell proliferation, elaboration of cytokines (e.g., IL-2), or production of an antibody.

Additionally, the present invention provides target molecules for treating or preventing metastasis of breast cancer. According to the present invention, genes listed in Table 11 (i.e., genes of BRC Nos. 719-752) were identified as genes having unique altered expression patterns in breast cancer cells with lymph-node metastasis. Thus, metastasis of breast cancer can be treated or prevented via the suppression of the expression or activity of up-regulated genes or their gene products selected from the group consisting of VAMP3, MGC11257, GSPT1, DNM2, CFL1, CLNS1A, SENP2, NDUFS3, NOP5/NOP58, PSMD13, SUOX, HRB2, LOC154467, THTPA, ZRF1, LOC51255, DEAF1, NEU1, UGCGL1, BRAF, TUFM, FLJ10726, DNAJB1, AP4S1, and MRPL40. Alternatively, metastasis of breast cancer can be treated or prevented by enhancing the expression or activity of UBA52, GenBank Acc# AA634090, CEACAM3, C21orf97, KIAA1040, EEF1D, FUS, GenBank Acc# AW965200, and KIAA0475 in cancerous cells.

The present invention also provides methods for predicting metastasis of breast cancer. Specifically, the present method comprises the step of measuring the expression level of marker genes selected from the group consisting of genes listed in Table 11. These marker genes are identified herein as genes having unique altered expression patterns in breast cancer cells of patients with lymph node metastasis. Therefore, metastasis of the breast cancer in a subject can be predicted by determining whether the expression level detected in a sample derived from the subject is closer to the mean expression level of lymph node metastasis positive cases or negative cases in reference samples.
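The comparison against mean expression levels described above can be sketched as a nearest-mean rule. This is only an illustration under stated assumptions: the function name is hypothetical, Euclidean distance is an assumed measure of "closer" (the text does not specify one), and any expression values supplied to it would be example inputs rather than data from the specification.

```python
from statistics import mean


def predict_metastasis(sample, positive_refs, negative_refs):
    """Return 'positive' or 'negative' for lymph node metastasis.

    sample: dict mapping marker gene name -> expression level.
    positive_refs / negative_refs: lists of such dicts for reference cases.
    Euclidean distance to each group mean is an assumption of this sketch.
    """
    genes = sorted(sample)

    def group_mean(refs):
        # Mean expression level of each marker gene across a reference group.
        return {g: mean(r[g] for r in refs) for g in genes}

    def distance(a, b):
        return sum((a[g] - b[g]) ** 2 for g in genes) ** 0.5

    d_pos = distance(sample, group_mean(positive_refs))
    d_neg = distance(sample, group_mean(negative_refs))
    # A tie is reported as 'negative' here; this is an arbitrary choice.
    return "positive" if d_pos < d_neg else "negative"
```

In practice the marker set would be drawn from Table 11, and expression ratios would normally be put on a comparable scale before distances are computed.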

Among the up-regulated genes, we identified A7870, designated T-LAK cell-originated protein kinase (TOPK), that was more than three-fold overexpressed in 30 of 39 (77%) breast cancer cases for which expression data could be obtained, especially in 29 of 36 (81%) cases with invasive ductal carcinoma specimens. Subsequent semi-quantitative RT-PCR also confirmed that A7870 was up-regulated in 7 of 12 clinical breast cancer samples and 17 of 20 breast cancer cell lines, compared to normal human organs including breast ductal cells or normal breast. Northern blot analyses revealed that the A7870 transcript was expressed only in breast cancer cell lines and normal human testis and thymus. Immunocytochemical staining with TOPK antibody showed that endogenous A7870 was localized in the cytoplasm and around the nuclear membrane in the breast cancer cell lines T47D, BT-20 and HBC5. Treatment of breast cancer cells with small interfering RNAs (siRNAs) effectively inhibited expression of A7870 and suppressed cell/tumor growth of the breast cancer cell lines T47D and BT-20, suggesting that this gene plays a key role in cell growth and proliferation. These findings suggest that overexpression of A7870 might be involved in breast tumorigenesis, and that A7870 may offer a promising strategy for specific treatment of breast cancer patients.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference herein in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. One advantage of the methods described herein is that the disease is identified prior to detection of overt clinical symptoms of breast cancer. Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts images of premicrodissected (lane A), postmicrodissected (lane B), and the microdissected cells (lane C). Microdissection of DCIS, IDC cells and normal breast ductal epithelial cells was performed using Laser microbeam microdissection (LMM). DCIS cells (10326T case), IDC cells (10502T), and normal breast ductal epithelial cells (10341N) from each specimen were microdissected from hematoxylin and eosin stained sections.

FIG. 2 depicts the results of unsupervised two-dimensional hierarchical clustering analysis of 710 genes across 102 samples. In FIG. 2(A), each horizontal row represents a breast cancer patient, and each vertical column shows a single gene. The color of each well, represented with red and green, indicates transcript levels above and below the median for that gene across all samples, respectively. An asterisk mark indicates the major histological type, and a sharp mark indicates the minor histological type in the same case. A square indicates a duplicated case (10149a1 and 10149a1T). A black square indicates unchanged expression. ER refers to ER status measured by EIA, LN to lymph-node metastasis status, and ESR1 to expression profiles of ESR1 in this microarray. FIG. 2(B) depicts two-dimensional hierarchical clustering analysis of 89 genes across 16 samples with 2 differentiated lesions microdissected from 8 breast cancer patients. FIG. 2(C) depicts clustering analysis using 25 genes that showed differential expression between well- and poorly-differentiated invasive ductal cancer cells.
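The color convention of FIG. 2(A) (red above and green below the per-gene median across all samples) can be sketched as follows. The function name is hypothetical, and treating a level exactly equal to the median as "black" (unchanged) is an assumption of this sketch rather than a rule stated in the text.

```python
from statistics import median


def color_wells(levels):
    """Assign a display color to each sample's transcript level for one gene.

    levels: expression levels of a single gene across all samples.
    Red marks a level above the gene's median, green a level below it;
    mapping equality to black is an assumed convention.
    """
    m = median(levels)
    colors = []
    for level in levels:
        if level > m:
            colors.append("red")
        elif level < m:
            colors.append("green")
        else:
            colors.append("black")
    return colors
```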

FIG. 3 depicts the supervised hierarchical clustering analysis of genes using 97 genes selected by a random-permutation test. In the horizontal row, 41 ER-positive samples and 28 ER-negative samples (selected from premenopausal patients) are shown. In the vertical column, 97 genes were clustered in different branches according to similarity in relative expression ratios. Genes in the lower main branch were preferentially expressed in a manner similar to the expression level of ESR1, as in FIG. 2(A). Those in the upper branch were inversely correlated with ESR1.

FIG. 4 depicts genes with altered expression in DCIS relative to normal duct and in IDC relative to DCIS. FIG. 4(A) depicts a cluster of 251 genes commonly up- or down-regulated in DCIS and IDC. FIG. 4(B) depicts a cluster of 74 genes having elevated or decreased expression in transition from DCIS to IDC. FIG. 4(C) depicts a cluster of 65 genes specifically up- or down-regulated in IDC.

FIG. 5 depicts the results of semi-quantitative RT-PCR validation of highly expressed genes. Specifically, expression of 5 genes (AI261804, AA205444 and AA167194 in 12 well-differentiated cases, and AA676987 and H22566 in 12 poorly-differentiated cases) and GAPDH (internal control) was examined by semi-quantitative RT-PCR. Signals of the microarray corresponded to the results of semi-quantitative RT-PCR experiments. Normal breast duct cells were prepared from normal ductal epithelial cells in the 15 premenopausal patients used in this microarray. MG refers to whole human mammary gland.

FIG. 6 depicts the results of semi-quantitative RT-PCR. Expression levels of A7870 in tumor cells from (a) 12 breast cancer patients, (b) breast cancer cell lines (HBC4, HBC5, HBL100, HCC1937, MCF7, MDA-MB-231, SKBR3, T47D, YMB1, BT-20, BT-474, BT-549, HCC1143, HCC1500, HCC1599, MDA-MB-157, MDA-MB-435S, MDA-MB-453, OCUB-F and ZR-75-1), and normal human tissues are shown.

FIG. 7 depicts the results of Northern blot analysis of A7870 transcripts in (a) various human tissues, and (b) breast cancer cell lines and normal human vital organs.

FIG. 8 depicts the subcellular localization of (a) exogenous A7870 in transfected COS7 cells and (b) endogenous A7870 in T47D, BT-20 and HBC5 cells.

FIG. 9 depicts the supervised hierarchical clustering analysis of genes using 206 genes selected by a random-permutation test. In the horizontal row, 69 samples (selected from IDC patients) are depicted. In the vertical column, 97 genes were clustered in different branches according to similarity in relative expression ratios. Genes in branch 1 and branch 2 were preferentially expressed similarly to the expression level of the poorly-differentiated type and the well-differentiated type, respectively.

FIG. 10(A) depicts the results of a two-dimensional hierarchical clustering analysis using 34 genes selected by evaluation of classification and leave-one-out test after a random-permutation test for establishing a predictive scoring system. Genes in the upper main branch were preferentially expressed in cases involving lymph node metastasis; those in the lower branch were more highly expressed in lymph node-negative cases. FIG. 10(B) depicts the strength of genes appearing in FIG. 10(A) for separating non-metastatic (lymph node-negative) tumors from metastatic (lymph node-positive) tumors. Squares represent node-positive cases; triangles denote negative cases. The 17 empty squares represent lymph node-positive test cases and the 20 empty triangles represent lymph node-negative test cases that were not used for establishing prediction scores. FIG. 10(C) depicts the correlation between the prediction score for metastasis and clinical information after operation.

DETAILED DESCRIPTION OF THE INVENTION

The words “a”, “an” and “the” as used herein mean “at least one” unless otherwise specifically indicated.

Generally, breast cancer cells exist as a solid mass having a highly inflammatory reaction and containing various cellular components. Therefore, previously published microarray data are likely to reflect heterogeneous profiles.

With these issues in view, the present inventors prepared purified populations of breast cancer cells and normal breast epithelial duct cells by a method of laser-microbeam microdissection (LMM), and analyzed genome-wide gene-expression profiles of 81 BRCs, including 12 ductal carcinomas in situ (DCIS) and 69 invasive ductal carcinomas (IDC), using a cDNA microarray representing 23,040 genes. These data should not only provide important information about breast carcinogenesis, but should also facilitate the identification of candidate genes whose products may serve as diagnostic markers and/or as molecular targets for treatment of patients with breast cancer, and should provide clinically relevant information.

The present invention is based, in part, on the discovery of changes in expression patterns of multiple nucleic acids between epithelial cells and carcinomas of patients with BRC. The differences in gene expression were identified using a comprehensive cDNA microarray system.

The gene-expression profiles of cancer cells from 81 BRCs, including 12 DCISs and 69 IDCs, were analyzed using a cDNA microarray representing 23,040 genes coupled with laser microdissection. By comparing expression patterns between cancer cells from patients diagnosed with BRC and normal ductal epithelial cells purely selected with Laser Microdissection, 102 genes (shown in Tables 3, 5 and 7) were identified as commonly up-regulated in BRC cells and among them 100 genes were selected as BRC-associated genes of the present invention. Similarly, 288 genes (shown in Tables 4, 6 and 8) were also identified as being commonly down-regulated in BRC cells. In addition, selection was made of candidate molecular markers having the potential to detect cancer-related proteins in serum or sputum of patients, and some potential targets for development of signal-suppressing strategies in human BRC were discovered. Among them, Tables 3 and 4 provide a list of genes whose expression is altered between BRC, including DCIS and IDC, and normal tissue. Genes commonly up- or down-regulated in DCIS and IDC are shown in Table 3 and Table 4, respectively. Genes having elevated or decreased expression in transition from DCIS to IDC are listed in Tables 5 and 6, respectively. Furthermore, genes commonly up- or down-regulated in IDC as compared with normal tissue are listed in Tables 7 and 8, respectively.

The differentially expressed genes identified herein find diagnostic utility as markers of BRC and as BRC gene targets, the expression of which may be altered to treat or alleviate a symptom of BRC. Alternatively, the genes differentially expressed between DCIS and IDC identified herein find diagnostic utility as markers for distinguishing IDC from DCIS and as BRC gene targets, the expression of which may be altered to treat or alleviate a symptom of IDC.

The genes whose expression level is modulated (i.e., increased or decreased) in BRC patients are summarized in Tables 3-8 and are collectively referred to herein as “BRC-associated genes”, “BRC nucleic acids” or “BRC polynucleotides” and the corresponding encoded polypeptides are referred to as “BRC polypeptides” or “BRC proteins.” Unless indicated otherwise, “BRC” refers to any of the sequences disclosed herein (e.g., BRC-associated genes listed in Tables 3-8). Genes that have been previously described are presented along with a database accession number.

By measuring expression of the various genes in a sample of cells, BRC can be diagnosed. Similarly, measuring the expression of these genes in response to various agents can identify agents for treating BRC.

The present invention involves determining (e.g., measuring) the expression of at least one, and up to all the BRC-associated genes listed in Tables 3-8. Using sequence information provided by the GenBank™ database entries for known sequences, the BRC-associated genes can be detected and measured using techniques well known to one of ordinary skill in the art. For example, sequences within the sequence database entries corresponding to BRC-associated genes can be used to construct probes for detecting RNA sequences corresponding to BRC-associated genes in, e.g., Northern blot hybridization analyses. Probes typically include at least 10, at least 20, at least 50, at least 100, or at least 200 nucleotides of a reference sequence. As another example, the sequences can be used to construct primers for specifically amplifying the BRC nucleic acid in, e.g., amplification-based detection methods, such as reverse-transcription based polymerase chain reaction.

The expression level of one or more BRC-associated genes in a test cell population, e.g., a patient-derived tissue sample, is then compared to the expression level(s) of the same gene(s) in a reference population. The reference cell population includes one or more cells for which the compared parameter is known, i.e., breast ductal carcinoma cells (e.g., BRC cells) or normal breast ductal epithelial cells (e.g., non-BRC cells).

Whether or not a pattern of gene expression in a test cell population as compared to a reference cell population indicates BRC or a predisposition thereto depends upon the composition of the reference cell population. For example, if the reference cell population is composed of non-BRC cells, a similarity in gene expression pattern between the test cell population and the reference cell population indicates the test cell population is non-BRC. Conversely, if the reference cell population is made up of BRC cells, a similarity in gene expression profile between the test cell population and the reference cell population indicates that the test cell population includes BRC cells.

A level of expression of a BRC marker gene in a test cell population is considered “altered” if it varies from the expression level of the corresponding BRC marker gene in a reference cell population by more than 1.1, more than 1.5, more than 2.0, more than 5.0, more than 10.0 or more fold.
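By way of non-limiting illustration, the fold-difference criterion above can be sketched in Python as follows. The function names `fold_change` and `is_altered` are hypothetical, and treating an increase and a decrease symmetrically (so that a 2.5-fold drop counts the same as a 2.5-fold rise) is an assumption of this sketch rather than a requirement stated herein.

```python
def fold_change(test_level: float, reference_level: float) -> float:
    """Return the fold difference between two positive expression levels.

    The result is always >= 1, so increases and decreases are treated
    symmetrically (an assumption of this sketch).
    """
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / reference_level
    return ratio if ratio >= 1 else 1 / ratio


def is_altered(test_level: float, reference_level: float,
               threshold: float = 2.0) -> bool:
    """Report whether the level differs from the reference by more than `threshold` fold."""
    return fold_change(test_level, reference_level) > threshold
```

The `threshold` argument may be set to any of the recited cut-offs (e.g., 1.1, 1.5, 2.0, 5.0 or 10.0).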

Differential gene expression between a test cell population and a reference cell population can be normalized to a control nucleic acid, e.g. a housekeeping gene. For example, a control nucleic acid is one which is known not to differ depending on the cancerous or non-cancerous state of the cell. The expression level of a control nucleic acid can be used to normalize signal levels in the test and reference populations. Exemplary control genes include, but are not limited to, e.g., β-actin, glyceraldehyde 3-phosphate dehydrogenase and ribosomal protein P1.
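As a minimal sketch of such normalization, and assuming that each signal is simply divided by the control-gene signal measured in the same population (other normalization schemes are equally possible), the comparison may be written in Python as follows. The function names are hypothetical.

```python
def normalize(signal: float, control_signal: float) -> float:
    """Divide a gene's signal by the control (housekeeping) gene signal
    measured in the same cell population."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return signal / control_signal


def normalized_ratio(test_signal: float, test_control: float,
                     ref_signal: float, ref_control: float) -> float:
    """Return the test/reference expression ratio after each side has been
    normalized to its own control-gene signal."""
    ref_normalized = normalize(ref_signal, ref_control)
    if ref_normalized <= 0:
        raise ValueError("reference signal must be positive")
    return normalize(test_signal, test_control) / ref_normalized
```

For instance, a test signal of 300 against a control of 100, compared with a reference signal of 150 against a control of 100, gives a normalized ratio of 2.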

The test cell population can be compared to multiple reference cell populations. Each of the multiple reference populations may differ in the known parameter. Thus, a test cell population may be compared to a first reference cell population known to contain, e.g., BRC cells, as well as a second reference population known to contain, e.g., non-BRC cells (normal cells). The test cell may be included in a tissue type or cell sample from a subject known to contain, or suspected of containing, BRC cells.

The test cell is obtained from a bodily tissue or a bodily fluid, e.g., biological fluid (such as blood or sputum, for example). For example, the test cell may be purified from breast tissue. Preferably, the test cell population comprises an epithelial cell. The epithelial cell is preferably from a tissue known to be or suspected to be a breast ductal carcinoma.

Cells in the reference cell population should be derived from a tissue type similar to that of the test cell. Optionally, the reference cell population is a cell line, e.g. a BRC cell line (i.e., a positive control) or a normal non-BRC cell line (i.e., a negative control). Alternatively, the control cell population may be derived from a database of molecular information derived from cells for which the assayed parameter or condition is known.

The subject is preferably a mammal. Exemplary mammals include, but are not limited to, e.g., a human, non-human primate, mouse, rat, dog, cat, horse, or cow.

Expression of the genes disclosed herein can be determined at the protein or nucleic acid level, using methods known in the art. For example, Northern hybridization analysis, using probes which specifically recognize one or more of these nucleic acid sequences, can be used to determine gene expression. Alternatively, gene expression may be measured using reverse-transcription-based PCR assays, e.g., using primers specific for the differentially expressed gene sequences. Expression may also be determined at the protein level, i.e., by measuring the level of a polypeptide encoded by a gene described herein, or the biological activity thereof. Such methods are well known in the art and include, but are not limited to, e.g., immunoassays that utilize antibodies to proteins encoded by the genes. The biological activities of the proteins encoded by the genes are generally well known.

Diagnosing Breast Cancer:

In the context of the present invention, BRC is diagnosed by measuring the expression level of one or more BRC nucleic acids from a test population of cells (i.e., a patient-derived biological sample). Preferably, the test cell population contains an epithelial cell, e.g., a cell obtained from breast tissue. Gene expression can also be measured from blood or other bodily fluids such as urine. Other biological samples can be used for measuring protein levels. For example, the protein level in blood or serum derived from a subject to be diagnosed can be measured by immunoassay or other conventional biological assay.

Expression of one or more BRC-associated genes, e.g., genes listed in Tables 3-8, is determined in the test cell or biological sample and compared to the normal control expression level associated with the one or more BRC-associated gene(s) assayed. A normal control level is an expression profile of a BRC-associated gene typically found in a population known not to be suffering from BRC. An alteration (e.g., an increase or decrease) in the level of expression in the patient-derived tissue sample of one or more BRC-associated genes indicates that the subject is suffering from or is at risk of developing BRC. For example, an increase in the expression of one or more up-regulated BRC-associated genes listed in Tables 3, 5, and 7 in the test population as compared to the normal control level indicates that the subject is suffering from or is at risk of developing BRC. Conversely, a decrease in expression of one or more down-regulated BRC-associated genes listed in Tables 4, 6, and 8 in the test population as compared to the normal control level indicates that the subject is suffering from or is at risk of developing BRC.

Alteration of one or more of the BRC-associated genes in the test population as compared to the normal control level indicates that the subject suffers from or is at risk of developing BRC. For example, alteration of at least 1%, at least 5%, at least 25%, at least 50%, at least 60%, at least 80%, at least 90% or more of the panel of BRC-associated genes (genes listed in Tables 3-8) indicates that the subject suffers from or is at risk of developing BRC.
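By way of illustration only, the proportion of a gene panel that is altered relative to a normal control may be computed as sketched below. The function name `fraction_altered`, the dictionary-based data layout, the default 2.0-fold cut-off and the gene labels used in the usage note are all hypothetical choices made for this sketch.

```python
def fraction_altered(test_levels: dict, normal_levels: dict,
                     threshold: float = 2.0) -> float:
    """Return the fraction of shared panel genes whose expression differs
    from the normal control level by more than `threshold` fold.

    Both arguments map a gene label to a positive expression level.
    """
    shared = sorted(set(test_levels) & set(normal_levels))
    if not shared:
        raise ValueError("no genes in common between test and control")
    altered = 0
    for gene in shared:
        test, normal = test_levels[gene], normal_levels[gene]
        if test <= 0 or normal <= 0:
            raise ValueError(f"non-positive expression level for {gene}")
        ratio = test / normal
        fold = ratio if ratio >= 1 else 1 / ratio  # symmetric fold difference
        if fold > threshold:
            altered += 1
    return altered / len(shared)
```

For example, with test levels `{"G1": 10, "G2": 5, "G3": 1, "G4": 4}` and control levels `{"G1": 4, "G2": 5, "G3": 4, "G4": 4}`, two of the four genes differ by more than two-fold, and the returned fraction can then be compared against any of the recited percentages.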

Identifying Histopathological Differentiation of BRC:

The present invention provides a method for identifying histopathological differentiation of BRC in a subject, the method comprising the steps of:

    (a) detecting an expression level of one or more marker genes in a tissue sample collected from the subject being tested, wherein the one or more marker genes are selected from the group consisting of genes listed in Tables 1 and 10;
    (b) comparing the detected expression level of the one or more marker genes to an expression level associated with a well-differentiated case and poorly-differentiated case; and
    (c) such that when the detected expression level of one or more marker genes is similar to that of the well-differentiated case, the tissue sample is determined to be well-differentiated and when the detected expression level of one or more marker genes is similar to that of the poorly-differentiated case, the tissue sample is determined to be poorly-differentiated.
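The comparison of steps (b) and (c) can be sketched as follows. The measure of similarity is not fixed by the steps above; this sketch assumes, purely for illustration, that similarity is the Pearson correlation between the sample's expression values and each reference profile over the same ordered set of marker genes. The function names are hypothetical.

```python
import math


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def classify_differentiation(sample: list, well_profile: list,
                             poor_profile: list) -> str:
    """Assign the sample to whichever reference profile it correlates with
    more strongly; equal correlations are reported as indeterminate."""
    r_well = pearson(sample, well_profile)
    r_poor = pearson(sample, poor_profile)
    if r_well > r_poor:
        return "well-differentiated"
    if r_poor > r_well:
        return "poorly-differentiated"
    return "indeterminate"
```

Other similarity measures, such as a distance between log expression ratios, could be substituted without changing the overall comparison.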

In the present invention, marker gene(s) for identifying histopathological differentiation of BRC may be at least one gene selected from the group consisting of 231 genes shown in Tables 1 and 10. The nucleotide sequences of the genes and amino acid sequences encoded thereby are known in the art. See Tables 1 and 10 for the Accession Numbers of the genes.

Identifying Agents that Inhibit or Enhance BRC-Associated Gene Expression:

An agent that inhibits the expression of a BRC-associated gene or the activity of its gene product can be identified by contacting a test cell population expressing a BRC-associated up-regulated gene with a test agent and then determining the expression level of the BRC-associated gene or the activity of its gene product. A decrease in the level of expression of the BRC-associated gene or in the level of activity of its gene product in the presence of the agent as compared to the expression or activity level in the absence of the test agent indicates that the agent is an inhibitor of a BRC-associated up-regulated gene and useful in inhibiting BRC.

Alternatively, an agent that enhances the expression of a BRC-associated down-regulated gene or the activity of its gene product can be identified by contacting a test cell population expressing a BRC-associated gene with a test agent and then determining the expression level or activity of the BRC-associated down-regulated gene. An increase in the level of expression of the BRC-associated gene or in the level of activity of its gene product as compared to the expression or activity level in the absence of the test agent indicates that the test agent augments expression of the BRC-associated down-regulated gene or the activity of its gene product.

The test cell population may be any cell expressing the BRC-associated genes. For example, the test cell population may contain an epithelial cell, such as a cell derived from breast tissue. Furthermore, the test cell may be an immortalized cell line derived from a carcinoma cell. Alternatively, the test cell may be a cell which has been transfected with a BRC-associated gene or which has been transfected with a regulatory sequence (e.g. promoter sequence) from a BRC-associated gene operably linked to a reporter gene.

Assessing Efficacy of Treatment of BRC in a Subject:

The differentially expressed BRC-associated genes identified herein also allow for the course of treatment of BRC to be monitored. In this method, a test cell population is provided from a subject undergoing treatment for BRC. If desired, test cell populations are obtained from the subject at various time points, before, during, and/or after treatment. Expression of one or more of the BRC-associated genes in the cell population is then determined and compared to a reference cell population which includes cells whose BRC state is known. In the context of the present invention, the reference cells should not have been exposed to the treatment of interest.

If the reference cell population contains no BRC cells, a similarity in the expression of a BRC-associated gene in the test cell population and the reference cell population indicates that the treatment of interest is efficacious. However, a difference in the expression of a BRC-associated gene in the test population and a normal control reference cell population indicates a less favorable clinical outcome or prognosis. Similarly, if the reference cell population contains BRC cells, a difference between the expression of a BRC-associated gene in the test cell population and the reference cell population indicates that the treatment of interest is efficacious, while a similarity in the expression of a BRC-associated gene in the test population and a cancer control reference cell population indicates a less favorable clinical outcome or prognosis.

Additionally, the expression level of one or more BRC-associated genes determined in a subject-derived biological sample obtained after treatment (i.e., post-treatment levels) can be compared to the expression level of the one or more BRC-associated genes determined in a subject-derived biological sample obtained prior to treatment onset (i.e., pre-treatment levels). If the BRC-associated gene is an up-regulated gene, a decrease in the expression level in a post-treatment sample indicates that the treatment of interest is efficacious while an increase or maintenance in the expression level in the post-treatment sample indicates a less favorable clinical outcome or prognosis. Conversely, if the BRC-associated gene is a down-regulated gene, an increase in the expression level in a post-treatment sample may indicate that the treatment of interest is efficacious while a decrease or maintenance in the expression level in the post-treatment sample indicates a less favorable clinical outcome or prognosis.
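The pre-treatment versus post-treatment comparison described above can be expressed as a short decision rule. The following Python sketch is illustrative only; the function name `assess_treatment`, the `"up"`/`"down"` labels and the returned strings are hypothetical, and no measurement tolerance is modeled, so any unchanged level is treated as maintenance.

```python
def assess_treatment(direction: str, pre_level: float, post_level: float) -> str:
    """Interpret a change in expression of one BRC-associated gene.

    direction: "up" for a gene up-regulated in BRC, "down" for a gene
    down-regulated in BRC.
    """
    if direction == "up":
        improved = post_level < pre_level   # up-regulated gene has decreased
    elif direction == "down":
        improved = post_level > pre_level   # down-regulated gene has increased
    else:
        raise ValueError("direction must be 'up' or 'down'")
    # Maintenance (no change) falls into the less favorable category.
    return "efficacious" if improved else "less favorable"
```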

As used herein, the term “efficacious” indicates that the treatment leads to a reduction in the expression of a pathologically up-regulated gene, an increase in the expression of a pathologically down-regulated gene or a decrease in size, prevalence, or metastatic potential of breast ductal carcinoma in a subject. When a treatment of interest is applied prophylactically, the term “efficacious” means that the treatment retards or prevents a breast tumor from forming or retards, prevents, or alleviates a symptom of clinical BRC. Assessment of breast tumors can be made using standard clinical protocols.

In addition, efficaciousness can be determined in association with any known method for diagnosing or treating BRC. BRC can be diagnosed, for example, by identifying symptomatic anomalies, e.g., weight loss, abdominal pain, back pain, anorexia, nausea, vomiting and generalized malaise, weakness, and jaundice.

Selecting a Therapeutic Agent for Treating BRC that is Appropriate for a Particular Individual:

Differences in the genetic makeup of individuals can result in differences in their relative abilities to metabolize various drugs. An agent that is metabolized in a subject to act as an anti-BRC agent can manifest itself by inducing a change in a gene expression pattern in the subject's cells from that characteristic of a cancerous state to a gene expression pattern characteristic of a non-cancerous state. Accordingly, the differentially expressed BRC-associated genes disclosed herein allow for a putative therapeutic or prophylactic inhibitor of BRC to be tested in a test cell population from a selected subject in order to determine if the agent is a suitable inhibitor of BRC in the subject.

To identify an inhibitor of BRC that is appropriate for a specific subject, a test cell population from the subject is exposed to a therapeutic agent, and the expression of one or more of the BRC-associated genes listed in Tables 3-8 is determined.

In the context of the method of the present invention, the test cell population contains a BRC cell expressing a BRC-associated gene. Preferably, the test cell is an epithelial cell. For example, a test cell population may be incubated in the presence of a candidate agent and the pattern of gene expression of the test cell population may be measured and compared to one or more reference profiles, e.g., a BRC reference expression profile or a non-BRC reference expression profile.

A decrease in expression of one or more of the BRC-associated genes listed in Tables 3, 5, and 7 or an increase in expression of one or more of the BRC-associated genes listed in Tables 4, 6, and 8 in a test cell population relative to a reference cell population containing BRC indicates that the agent has therapeutic potential.

In the context of the present invention, the test agent can be any compound or composition. Exemplary test agents include, but are not limited to, immunomodulatory agents.

Screening Assays for Identifying Therapeutic Agents:

The differentially expressed BRC-associated genes disclosed herein can also be used to identify candidate therapeutic agents for treating BRC. The method of the present invention involves screening a candidate therapeutic agent to determine if it can convert an expression profile of one or more BRC-associated genes listed in Tables 3-8 characteristic of a BRC state to a gene expression pattern characteristic of a non-BRC state.

In the instant method, a cell is exposed to a test agent or a plurality of test agents (sequentially or in combination) and the expression of one or more of the BRC-associated genes listed in Tables 3-8 in the cell is measured. The expression profile of the BRC-associated gene(s) assayed in the test population is compared to the expression level of the same BRC-associated gene(s) in a reference cell population that is not exposed to the test agent.

An agent capable of stimulating the expression of an under-expressed gene or suppressing the expression of an over-expressed gene has potential clinical benefit. Such agents may be further tested for the ability to prevent breast ductal carcinoma growth in animals or test subjects.

In a further embodiment, the present invention provides methods for screening candidate agents which act on the potential targets in the treatment of BRC. As discussed in detail above, by controlling the expression levels of marker genes or the activities of their gene products, one can control the onset and progression of BRC. Thus, candidate agents, which act on the potential targets in the treatment of BRC, can be identified through screening methods that use such expression levels and activities as indices of the cancerous or non-cancerous state. In the context of the present invention, such screening may comprise, for example, the following steps:

    a) contacting a test compound with a polypeptide encoded by a polynucleotide selected from the group consisting of the genes listed in Table 3, 4, 5, 6, 7 or 8;
    b) detecting the binding activity between the polypeptide and the test compound; and
    c) selecting the test compound that binds to the polypeptide.

Alternatively, the screening method of the present invention may comprise the following steps:

    a) contacting a candidate compound with a cell expressing one or more marker genes, wherein the one or more marker genes are selected from the group consisting of the genes listed in Table 3, 4, 5, 6, 7 or 8; and
    b) selecting the candidate compound that reduces the expression level of one or more marker genes selected from the group consisting of the genes listed in Table 3, 5, and 7, or elevates the expression level of one or more marker genes selected from the group consisting of the genes listed in Table 4, 6 and 8.

Cells expressing a marker gene include, for example, cell lines established from BRC; such cells can be used for the above screening of the present invention.
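The selection of step b) of the expression-based screening can be sketched as a simple filter over measured expression levels. The `Measurement` record, its field names and the compound labels in the usage note are hypothetical, and the sketch assumes that any reduction or elevation, however small, qualifies; in practice a fold-difference cut-off might be applied instead.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    compound: str            # label of the candidate compound
    marker_direction: str    # "up" (Tables 3, 5, 7) or "down" (Tables 4, 6, 8)
    level_without: float     # marker expression without the compound
    level_with: float        # marker expression with the compound


def select_candidates(measurements: list) -> list:
    """Return labels of compounds that reduce an up-regulated marker or
    elevate a down-regulated marker, in input order without duplicates."""
    selected = []
    for m in measurements:
        if m.marker_direction == "up":
            hit = m.level_with < m.level_without
        elif m.marker_direction == "down":
            hit = m.level_with > m.level_without
        else:
            raise ValueError("marker_direction must be 'up' or 'down'")
        if hit and m.compound not in selected:
            selected.append(m.compound)
    return selected
```

For example, a compound that lowers an up-regulated marker from 10 to 4 would be selected, whereas one that leaves a down-regulated marker unchanged would not.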

Alternatively, the screening method of the present invention may comprise the following steps:

    a) contacting a test compound with a polypeptide encoded by a polynucleotide selected from the group consisting of the genes listed in Table 3, 4, 5, 6, 7 or 8;
    b) detecting the biological activity of the polypeptide of step (a); and
    c) selecting a compound that suppresses the biological activity of the polypeptide encoded by the polynucleotide selected from the group consisting of the genes listed in Table 3, 5 and 7 as compared to the biological activity detected in the absence of the test compound, or enhances the biological activity of the polypeptide encoded by the polynucleotide selected from the group consisting of the genes listed in Table 4, 6 and 8 as compared to the biological activity detected in the absence of the test compound.

A protein for use in the screening method of the present invention can be obtained as a recombinant protein using the nucleotide sequence of the marker gene. Based on the information regarding the marker gene and its encoded protein, one skilled in the art can select any biological activity of the protein as an index for screening and any suitable measurement method to assay for the selected biological activity.

Alternatively, the screening method of the present invention may comprise the following steps:

    a) contacting a candidate compound with a cell into which a vector, comprising the transcriptional regulatory region of one or more marker genes and a reporter gene that is expressed under the control of the transcriptional regulatory region, has been introduced, wherein the one or more marker genes are selected from the group consisting of the genes listed in Table 3, 4, 5, 6, 7 or 8;
    b) measuring the expression or activity of said reporter gene; and
    c) selecting the candidate compound that reduces the expression or activity of said reporter gene when said marker gene is an up-regulated marker gene selected from the group consisting of the genes listed in Table 3, 5 and 7, or that enhances the expression level of said reporter gene when said marker gene is a down-regulated marker gene selected from the group consisting of the genes listed in Table 4, 6 and 8, as compared to a control.

Suitable reporter genes and host cells are well known in the art. A reporter construct suitable for the screening method of the present invention can be prepared by using the transcriptional regulatory region of a marker gene. When the transcriptional regulatory region of the marker gene is known to those skilled in the art, a reporter construct can be prepared by using the previous sequence information. When the transcriptional regulatory region of the marker gene remains unidentified, a nucleotide segment containing the transcriptional regulatory region can be isolated from a genome library based on the nucleotide sequence information of the marker gene.

A compound isolated by the screening serves as a candidate for the development of drugs that inhibit the expression of the marker gene or the activity of the protein encoded by the marker gene and can be applied to the treatment or prevention of breast cancer.

Moreover, compounds in which a part of the structure of the compound inhibiting the activity of proteins encoded by marker genes is converted by addition, deletion and/or replacement are also included as the compounds obtainable by the screening method of the present invention.

When administering a compound isolated by the method of the present invention as a pharmaceutical for humans and other mammals, such as mice, rats, guinea-pigs, rabbits, cats, dogs, sheep, pigs, cattle, monkeys, baboons, and chimpanzees, the isolated compound can be directly administered or can be formulated into a dosage form using known pharmaceutical preparation methods. For example, according to the need, the drugs can be taken orally, as sugar-coated tablets, capsules, elixirs and microcapsules, or non-orally, in the form of injections of sterile solutions or suspensions with water or any other pharmaceutically acceptable liquid. For example, the compounds can be mixed with pharmaceutically acceptable carriers or media, specifically, sterilized water, physiological saline, plant-oils, emulsifiers, suspending agents, surfactants, stabilizers, flavoring agents, excipients, vehicles, preservatives, binders, and such, in a unit dose form required for generally accepted drug implementation. The amount of active ingredient contained in such a preparation makes a suitable dosage within the indicated range acquirable.

Examples of additives that can be admixed into tablets and capsules include, but are not limited to, binders, such as gelatin, corn starch, tragacanth gum and arabic gum; excipients, such as crystalline cellulose; swelling agents, such as corn starch, gelatin and alginic acid; lubricants, such as magnesium stearate; sweeteners, such as sucrose, lactose or saccharin; and flavoring agents, such as peppermint, Gaultheria adenothrix oil and cherry. When the unit-dose form is a capsule, a liquid carrier, such as an oil, can be further included in the above ingredients. Sterile composites for injection can be formulated following normal drug implementations using vehicles, such as distilled water, suitable for injection.

Physiological saline, glucose, and other isotonic liquids, including adjuvants, such as D-sorbitol, D-mannose, D-mannitol, and sodium chloride, can be used as aqueous solutions for injection. These can be used in conjunction with suitable solubilizers, such as alcohol, for example, ethanol; polyalcohols, such as propylene glycol and polyethylene glycol; and non-ionic surfactants, such as Polysorbate 80 (TM) and HCO-50.

Sesame oil or soy-bean oil can be used as an oleaginous liquid, may be used in conjunction with benzyl benzoate or benzyl alcohol as a solubilizer, and may be formulated with a buffer, such as phosphate buffer and sodium acetate buffer; a pain-killer, such as procaine hydrochloride; a stabilizer, such as benzyl alcohol and phenol; and/or an anti-oxidant. A prepared injection may be filled into a suitable ampoule.

Methods well known to those skilled in the art may be used to administer the pharmaceutical composition of the present invention to patients, for example as an intraarterial, intravenous, or percutaneous injection or as an intranasal, transbronchial, intramuscular or oral administration. The dosage and method of administration vary according to the body-weight and age of a patient and the administration method; however, one skilled in the art can routinely select a suitable method of administration. If said compound is encodable by a DNA, the DNA can be inserted into a vector for gene therapy and the vector administered to a patient to perform the therapy. The dosage and method of administration vary according to the body-weight, age, and symptoms of the patient; however, one skilled in the art can suitably select them.

For example, although the dose of a compound that binds to a protein of the present invention and regulates its activity depends on the symptoms, the dose is generally about 0.1 mg to about 100 mg per day, preferably about 1.0 mg to about 50 mg per day and more preferably about 1.0 mg to about 20 mg per day, when administered orally to a normal adult human (weight 60 kg).

When administering the compound parenterally, in the form of an injection to a normal adult human (weight 60 kg), although there are some differences according to the patient, target organ, symptoms and method of administration, it is convenient to intravenously inject a dose of about 0.01 mg to about 30 mg per day, preferably about 0.1 to about 20 mg per day and more preferably about 0.1 to about 10 mg per day. In the case of other animals, the appropriate dosage amount may be routinely calculated by converting to 60 kg of body-weight.
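The conversion to 60 kg of body-weight mentioned above is not spelled out; one reading, assumed here purely to illustrate the arithmetic, is simple proportional scaling by body weight. The function name `scale_dose` and the linear relationship are assumptions of this sketch, and other conversions (for example, by body surface area) are not modeled.

```python
REFERENCE_WEIGHT_KG = 60.0  # body weight of the normal adult human used as reference


def scale_dose(dose_mg_per_day_at_60kg: float, body_weight_kg: float) -> float:
    """Scale a daily dose stated for a 60 kg subject linearly to another
    body weight (an assumed proportional conversion)."""
    if dose_mg_per_day_at_60kg < 0:
        raise ValueError("dose must be non-negative")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_day_at_60kg * body_weight_kg / REFERENCE_WEIGHT_KG
```

Under this assumption, a 30 mg per day dose stated for 60 kg corresponds to 10 mg per day for a 20 kg animal.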

Screening Assays for Identifying Therapeutic Agents for Metastasis of Breast Cancer:

The present invention provides target molecules for treating or preventing breast cancer metastasis. Screening assays for BRC metastasis of the present invention can be performed according to the method for BRC described above, using marker genes associated with BRC metastasis.

In the present invention, marker genes selected from the group consisting of genes listed in Table 11 are useful for the screening. The 34 genes shown in the Table are associated with lymph node metastasis. Among the genes, 25 genes (+) were relatively up-regulated and 9 genes (−) were down-regulated in node-positive tumors (Table 11 and FIG. 10). An agent, obtained by the present invention, that suppresses the expression of one or more of the up-regulated genes or the activity of their gene products is useful for treating or preventing BRC with lymph-node metastasis. Alternatively, an agent, obtained by the present invention, that enhances the expression of one or more of the down-regulated genes or the activity of their gene products is also useful for treating or preventing BRC with lymph-node metastasis.

In the present invention, an agent regulating the expression level of genes listed in Table 11 can be identified in the same manner as for identifying agents that inhibit or enhance BRC-associated gene expression. Alternatively, an agent regulating the activity of their gene products can also be identified in the same manner as for identifying agents that inhibit or enhance the activity of a BRC-associated gene product.

Assessing the Prognosis of a Subject with Breast Cancer:

The present invention also provides a method of assessing the prognosis of a subject with BRC including the step of comparing the expression of one or more BRC-associated genes in a test cell population to the expression of the same BRC-associated genes in a reference cell population derived from patients over a spectrum of disease stages. By comparing the gene expression of one or more BRC-associated genes in the test cell population and the reference cell population(s), or by comparing the pattern of gene expression over time in test cell populations derived from the subject, the prognosis of the subject can be assessed.

For example, an increase in the expression of one or more up-regulated BRC-associated genes, such as those listed in Table 3, 5 or 7, as compared to a normal control, or a decrease in the expression of one or more down-regulated BRC-associated genes, such as those listed in Table 4, 6 or 8, as compared to a normal control, indicates a less favorable prognosis. Conversely, a similarity in the expression of one or more of the BRC-associated genes listed in Tables 3-8 as compared to a normal control indicates a more favorable prognosis for the subject. Preferably, the prognosis of a subject can be assessed by comparing the expression profile of a gene selected from the group consisting of the genes listed in Tables 3, 4, 5, 6, 7 and 8. The classification score (CS) may be used for comparing the expression profile.

Kits:

The present invention also includes a BRC-detection reagent, e.g., a nucleic acid that specifically binds to or identifies one or more BRC nucleic acids, such as oligonucleotide sequences which are complementary to a portion of a BRC nucleic acid, or an antibody that binds to one or more proteins encoded by a BRC nucleic acid. The detection reagents may be packaged together in the form of a kit. For example, the detection reagents may be packaged in separate containers, e.g., a nucleic acid or antibody (either bound to a solid matrix or packaged separately with reagents for binding them to the matrix), a control reagent (positive and/or negative), and/or a detectable label. Instructions (e.g., written, tape, VCR, CD-ROM, etc.) for carrying out the assay may also be included in the kit. The assay format of the kit may be a Northern hybridization or a sandwich ELISA, both of which are known in the art.

For example, a BRC detection reagent may be immobilized on a solid matrix, such as a porous strip, to form at least one BRC detection site. The measurement or detection region of the porous strip may include a plurality of sites, each containing a nucleic acid. A test strip may also contain sites for negative and/or positive controls. Alternatively, control sites may be located on a separate strip from the test strip. Optionally, the different detection sites may contain different amounts of immobilized nucleic acids, i.e., a higher amount in the first detection site and lesser amounts in subsequent sites. Upon the addition of test sample, the number of sites displaying a detectable signal provides a quantitative indication of the amount of BRC present in the sample. The detection sites may be configured in any suitably detectable shape and are typically in the shape of a bar or dot spanning the width of a test strip.

Alternatively, the kit may contain a nucleic acid substrate array comprising one or more nucleic acids. The nucleic acids on the array specifically identify one or more nucleic acid sequences represented by the BRC-associated genes listed in Tables 3-8. The expression of 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 40 or 50 or more of the nucleic acids represented by the BRC-associated genes listed in Tables 3-8 may be identified by virtue of the level of binding to an array test strip or chip. The substrate array can be on, e.g., a solid substrate, such as a “chip” described in U.S. Pat. No. 5,744,305, the contents of which are incorporated by reference herein in their entirety.

Arrays and Pluralities:

The present invention also includes a nucleic acid substrate array comprising one or more nucleic acids. The nucleic acids on the array specifically correspond to one or more nucleic acid sequences represented by the BRC-associated genes listed in Tables 3-8. The level of expression of 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 40 or 50 or more of the nucleic acids represented by the BRC-associated genes listed in Tables 3-8 may be identified by detecting nucleic acid binding to the array.

The present invention also includes an isolated plurality (i.e., a mixture of two or more nucleic acids) of nucleic acids. The nucleic acids may be in a liquid phase or a solid phase, e.g., immobilized on a solid support such as a nitrocellulose membrane. The plurality includes one or more of the nucleic acids represented by the BRC-associated genes listed in Tables 3-8. In various embodiments, the plurality includes 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 40 or 50 or more of the nucleic acids represented by the BRC-associated genes listed in Tables 3-8.

Methods of Inhibiting Breast Cancer:

The present invention further provides a method for treating or alleviating a symptom of BRC in a subject by decreasing the expression of one or more of the BRC-associated genes listed in Tables 3, 5, and 7 (or the activity of its gene product) or increasing the expression of one or more of the BRC-associated genes listed in Tables 4, 6, and 8 (or the activity of its gene product). Suitable therapeutic compounds can be administered prophylactically or therapeutically to a subject suffering from or at risk of (or susceptible to) developing BRC. Such subjects can be identified using standard clinical methods or by detecting an aberrant level of expression of one or more of the BRC-associated genes listed in Tables 3-8 or aberrant activity of its gene product. In the context of the present invention, suitable therapeutic agents include, for example, inhibitors of cell cycle regulation, cell proliferation, and protein kinase activity.

The therapeutic method of the present invention includes the step of increasing the expression, function, or both of one or more gene products of genes whose expression is decreased (“down-regulated” or “under-expressed” genes) in a BRC cell relative to normal cells of the same tissue type from which the BRC cells are derived. In these methods, the subject is treated with an effective amount of a compound that increases the amount of one or more of the under-expressed (down-regulated) genes in the subject. Administration can be systemic or local. Suitable therapeutic compounds include a polypeptide product of an under-expressed gene, a biologically active fragment thereof, and a nucleic acid encoding an under-expressed gene and having expression control elements permitting expression in the BRC cells; for example, an agent that increases the level of expression of such a gene endogenous to the BRC cells (i.e., which up-regulates the expression of the under-expressed gene or genes). Administration of such compounds counters the effects of the aberrantly under-expressed gene or genes in the subject's breast cells and improves the clinical condition of the subject.

Alternatively, the therapeutic method of the present invention may include the step of decreasing the expression, function, or both, of one or more gene products of genes whose expression is aberrantly increased (“up-regulated” or “over-expressed” genes) in breast cells. Expression may be inhibited in any of several ways known in the art. For example, expression can be inhibited by administering to the subject a nucleic acid that inhibits or antagonizes the expression of the over-expressed gene or genes, e.g., an antisense oligonucleotide or small interfering RNA which disrupts expression of the over-expressed gene or genes.

Antisense Nucleic Acids:

As noted above, antisense nucleic acids corresponding to the nucleotide sequence of the BRC-associated genes listed in Tables 3, 5, and 7 can be used to reduce the expression level of the genes. Antisense nucleic acids corresponding to the BRC-associated genes listed in Tables 3, 5, and 7 that are up-regulated in breast cancer are useful for the treatment of breast cancer. Specifically, the antisense nucleic acids of the present invention may act by binding to the BRC-associated genes listed in Tables 3, 5, and 7, or mRNAs corresponding thereto, thereby inhibiting the transcription or translation of the genes, promoting the degradation of the mRNAs, and/or inhibiting the expression of proteins encoded by the BRC-associated genes listed in Tables 3, 5, and 7, thereby inhibiting the function of the proteins. The term “antisense nucleic acids” as used herein encompasses both nucleotides that are entirely complementary to the target sequence and those having a mismatch of one or more nucleotides, so long as the antisense nucleic acids can specifically hybridize to the target sequences. For example, the antisense nucleic acids of the present invention include polynucleotides that have a homology of at least 70%, preferably at least 80%, more preferably at least 90%, even more preferably at least 95% over a span of at least 15 continuous nucleotides. Algorithms known in the art can be used to determine the homology.
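As a minimal sketch of how the homology criterion above might be checked, the following computes the best ungapped percent identity over any 15 contiguous nucleotides between a target sequence and the sense sequence that an antisense strand would pair with. The function names and the ungapped sliding-window comparison are illustrative assumptions; the specification does not name a particular algorithm, and an alignment tool known in the art could equally be used.

```python
# Illustrative sketch; all names here are hypothetical.
COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}


def normalize(seq):
    """Lower-case a sequence and write DNA 't' as RNA 'u'."""
    return seq.lower().replace("t", "u")


def reverse_complement(seq):
    """Return the reverse complement of a sequence, written as RNA."""
    return "".join(COMPLEMENT[base] for base in reversed(normalize(seq)))


def best_window_identity(antisense, target, span=15):
    """Best ungapped percent identity over any `span` contiguous nucleotides.

    The antisense strand is converted to the sense sequence it would pair
    with; each `span`-long window of that sequence is compared with each
    `span`-long window of the target.
    """
    site = reverse_complement(antisense)
    tgt = normalize(target)
    if len(site) < span or len(tgt) < span:
        raise ValueError("both sequences must be at least `span` nucleotides long")
    best = 0
    for i in range(len(site) - span + 1):
        window = site[i:i + span]
        for j in range(len(tgt) - span + 1):
            matches = sum(a == b for a, b in zip(window, tgt[j:j + span]))
            best = max(best, matches)
    return 100.0 * best / span


# A made-up target containing a 19-nucleotide site, and two antisense strands:
# one fully complementary, one carrying a single central mismatch.
target = "aug" + "gaacgauauaaagccagcc" + "uga"
perfect = best_window_identity("ggcuggcuuuauaucguuc", target)
one_mismatch = best_window_identity("ggcuggcuugauaucguuc", target)
```

Under these assumptions, a strand with one central mismatch still scores 14 of 15 matches in its best window, which would satisfy the 90% threshold but not the 95% threshold.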

The antisense nucleic acids of the present invention act on cells producing the proteins encoded by BRC-associated marker genes by binding to the DNAs or mRNAs encoding the proteins, inhibiting their transcription or translation, promoting the degradation of the mRNAs, and inhibiting the expression of the proteins, thereby resulting in the inhibition of the protein function.

An antisense nucleic acid of the present invention can be made into an external preparation, such as a liniment or a poultice, by admixing it with a suitable base material which is inactive against the nucleic acid.

Also, as needed, the antisense nucleic acids of the present invention can be formulated into tablets, powders, granules, capsules, liposome capsules, injections, solutions, nose-drops and freeze-drying agents by adding excipients, isotonic agents, solubilizers, stabilizers, preservatives, pain-killers, and such. These can be prepared by following known methods.

The antisense nucleic acids of the present invention can be given to the patient by direct application onto the ailing site or by injection into a blood vessel so that they will reach the site of ailment. An antisense-mounting medium can also be used to increase durability and membrane-permeability. Examples include, but are not limited to, liposomes, poly-L-lysine, lipids, cholesterol, lipofectin or derivatives of these.

The dosage of the antisense nucleic acid derivative of the present invention can be adjusted suitably according to the patient's condition and used in desired amounts. For example, a dose range of 0.1 to 100 mg/kg, preferably 0.1 to 50 mg/kg can be administered.

The antisense nucleic acids of the present invention inhibit the expression of a protein of the present invention and are thereby useful for suppressing the biological activity of the protein of the invention. In addition, expression-inhibitors, comprising antisense nucleic acids of the present invention, are useful in that they can inhibit the biological activity of a protein of the present invention.

The method of the present invention can be used to alter the expression in a cell of an up-regulated BRC-associated gene, e.g., up-regulation resulting from the malignant transformation of the cells. Binding of the siRNA to a transcript corresponding to one of the BRC-associated genes listed in Tables 3, 5, and 7 in the target cell results in a reduction in the protein production by the cell. The length of the oligonucleotide is at least 10 nucleotides and may be as long as the naturally-occurring transcript. Preferably, the oligonucleotide is 19-25 nucleotides in length. Most preferably, the oligonucleotide is less than 75, 50, or 25 nucleotides in length.

The antisense nucleic acids of the present invention include modified oligonucleotides. For example, thioated oligonucleotides may be used to confer nuclease resistance to an oligonucleotide.

Also, an siRNA against a marker gene can be used to reduce the expression level of the marker gene. Herein, the term “siRNA” refers to a double stranded RNA molecule which prevents translation of a target mRNA. Standard techniques for introducing siRNA into the cell may be used, including those in which DNA is a template from which RNA is transcribed. In the context of the present invention, the siRNA comprises a sense nucleic acid sequence and an anti-sense nucleic acid sequence against an up-regulated marker gene, such as a BRC-associated gene listed in Tables 3, 5, and 7. The siRNA is constructed such that a single transcript has both the sense and complementary antisense sequences from the target gene, e.g., a hairpin.

An siRNA of a BRC-associated gene, such as those listed in Tables 3, 5, and 7, hybridizes to target mRNA and thereby decreases or inhibits production of the polypeptides encoded by the BRC-associated genes listed in Tables 3, 5, and 7 by associating with the normally single-stranded mRNA transcript, thereby interfering with translation and thus, expression of the protein. In the context of the present invention, an siRNA is preferably less than 500, 200, 100, 50, or 25 nucleotides in length. More preferably an siRNA is 19-25 nucleotides in length. Exemplary nucleic acid sequences for the production of TOPK siRNA include the sequences of nucleotides of SEQ ID NOs: 25, 28 and 31 as the target sequence. In order to enhance the inhibition activity of the siRNA, nucleotide “u” can be added to the 3′ end of the antisense strand of the target sequence. The number of “u”s to be added is at least 2, generally 2 to 10, preferably 2 to 5 (SEQ ID NO:32). The added “u”s form a single strand at the 3′ end of the antisense strand of the siRNA.
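The strand construction described above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the input is the 19-nucleotide sense sequence that appears in this document's TOPK hairpin example for the target sequence of SEQ ID NO: 25. Sequences are written 5′ to 3′, so appending to the string adds nucleotides at the 3′ end.

```python
# Illustrative sketch; names are hypothetical.
COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a sequence, written as RNA."""
    rna = seq.lower().replace("t", "u")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def sirna_strands(target, overhang=2):
    """Return (sense, antisense) strands, both written 5' to 3'.

    `overhang` "u" nucleotides are appended to the 3' end of the antisense
    strand; the text above calls for at least 2 and generally 2 to 10.
    """
    if not 2 <= overhang <= 10:
        raise ValueError("overhang should be 2 to 10 nucleotides")
    sense = target.lower().replace("t", "u")
    antisense = reverse_complement(sense) + "u" * overhang
    return sense, antisense


sense, antisense = sirna_strands("gaacgauauaaagccagcc", overhang=2)
```

The unpaired "u" nucleotides remain single-stranded because the sense strand has no complementary bases opposite them.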

An siRNA of a BRC-associated gene, such as those listed in Tables 3, 5, and 7, can be directly introduced into the cells in a form that is capable of binding to the mRNA transcripts. Alternatively, a DNA encoding the siRNA may be carried in a vector.

Vectors may be produced, for example, by cloning a BRC-associated gene target sequence into an expression vector having operatively-linked regulatory sequences flanking the sequence in a manner that allows for expression (by transcription of the DNA molecule) of both strands (Lee, N. S., Dohjima, T., Bauer, G., Li, H., Li, M.-J., Ehsani, A., Salvaterra, P., and Rossi, J. (2002) Expression of small interfering RNAs targeted against HIV-1 rev transcripts in human cells. Nature Biotechnology 20:500-505). An RNA molecule that is antisense to mRNA of a BRC-associated gene is transcribed by a first promoter (e.g., a promoter sequence 3′ of the cloned DNA) and an RNA molecule that is the sense strand for the mRNA of a BRC-associated gene is transcribed by a second promoter (e.g., a promoter sequence 5′ of the cloned DNA). The sense and antisense strands hybridize in vivo to generate siRNA constructs for silencing of the BRC-associated gene. Alternatively, the two constructs can be utilized to create the sense and anti-sense strands of an siRNA construct. Cloned BRC-associated genes can encode a construct having secondary structure, e.g., hairpins, wherein a single transcript has both the sense and complementary antisense sequences from the target gene.

A loop sequence consisting of an arbitrary nucleotide sequence can be located between the sense and antisense sequence in order to form the hairpin loop structure. Thus, the present invention also provides siRNA having the general formula 5′-[A]-[B]-[A′]-3′, wherein [A] is a ribonucleotide sequence corresponding to a sequence of a gene selected from Table 3, 5 or 7,

[B] is a ribonucleotide sequence consisting of 3 to 23 nucleotides, and

[A′] is a ribonucleotide sequence consisting of the complementary sequence of [A]. The region [A] hybridizes to [A′], and then a loop consisting of region [B] is formed. The loop sequence is preferably 3 to 23 nucleotides in length. The loop sequence can be selected, for example, from the group consisting of the following sequences (at ambion.com/techlib/tb/tb_506.html). Furthermore, a loop sequence consisting of 23 nucleotides also provides active siRNA (Jacque, J.-M., Triques, K., and Stevenson, M. (2002) Modulation of HIV-1 replication by RNA interference. Nature 418: 435-438).

CCC, CCACC or CCACACC: Jacque, J.-M., Triques, K., and Stevenson, M. (2002) Modulation of HIV-1 replication by RNA interference. Nature 418: 435-438.

UUCG: Lee, N. S., Dohjima, T., Bauer, G., Li, H., Li, M.-J., Ehsani, A., Salvaterra, P., and Rossi, J. (2002) Expression of small interfering RNAs targeted against HIV-1 rev transcripts in human cells. Nature Biotechnology 20:500-505. Fruscoloni, P., Zamboni, M., and Tocchini-Valentini, G. P. (2003) Exonucleolytic degradation of double-stranded RNA by an activity in Xenopus laevis germinal vesicles. Proc. Natl. Acad. Sci. USA 100(4): 1639-1644.

UUCAAGAGA: Dykxhoorn, D. M., Novina, C. D., and Sharp, P. A. (2003) Killing the messenger: Short RNAs that silence gene expression. Nature Reviews Molecular Cell Biology 4: 457-467.

Accordingly, the loop sequence can be selected from the group consisting of CCC, UUCG, CCACC, CCACACC, and UUCAAGAGA. A preferable loop sequence is UUCAAGAGA (“ttcaagaga” in DNA). Exemplary hairpin siRNAs suitable for use in the context of the present invention include:

for TOPK-siRNA

gaacgauauaaagccagcc-[b]-ggcuggcuuuauaucguuc (SEQ ID NOS:33-37 for target sequence of SEQ ID NO: 25);

cuggaugaaucauaccaga-[b]-ucugguaugauucauccag (SEQ ID NOS:38-42 for target sequence of SEQ ID NO: 28);

guguggcuugcguaaauaa-[b]-uuauuuacgcaagccacac (SEQ ID NOS:43-47 for target sequence of SEQ ID NO: 31).
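The general formula 5′-[A]-[B]-[A′]-3′ can be illustrated with a short sketch that assembles a hairpin transcript from a sense sequence [A], a loop [B], and the reverse complement [A′]. The function names are hypothetical, and the default loop is the UUCAAGAGA sequence named above as preferable; the check on loop length reflects the stated range of 3 to 23 nucleotides.

```python
# Illustrative sketch; names are hypothetical.
COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a sequence, written as RNA."""
    rna = seq.lower().replace("t", "u")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def hairpin_sirna(a, loop="uucaagaga"):
    """Assemble 5'-[A]-[B]-[A']-3' as a single RNA string."""
    sense = a.lower().replace("t", "u")
    loop = loop.lower().replace("t", "u")
    if not 3 <= len(loop) <= 23:
        raise ValueError("loop [B] should be 3 to 23 nucleotides long")
    return sense + loop + reverse_complement(sense)


# The three TOPK sense sequences listed above, each with the default loop.
hairpins = [
    hairpin_sirna(a)
    for a in (
        "gaacgauauaaagccagcc",
        "cuggaugaaucauaccaga",
        "guguggcuugcguaaauaa",
    )
]
```

Because [A′] is computed rather than typed in, the sketch also serves as a consistency check: the reverse complements it produces agree with the antisense halves written out in the three examples above.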

The nucleotide sequence of suitable siRNAs can be designed using an siRNA design computer program available from the Ambion website (at ambion.com/techlib/misc/siRNA_finder.html). The computer program selects nucleotide sequences for siRNA synthesis based on the following protocol.

Selection of siRNA Target Sites:

-   1. Beginning with the AUG start codon of the object transcript, scan downstream for AA dinucleotide sequences. Record the occurrence of each AA and the 3′ adjacent 19 nucleotides as potential siRNA target sites. Tuschl, et al. recommend against designing siRNA to the 5′ and 3′ untranslated regions (UTRs) and regions near the start codon (within 75 bases) as these may be richer in regulatory protein binding sites. UTR-binding proteins and/or translation initiation complexes may interfere with binding of the siRNA endonuclease complex.
-   2. Compare the potential target sites to the human genome database and eliminate from consideration any target sequences with significant homology to other coding sequences. The homology search can be performed using BLAST, which can be found on the NCBI server at ncbi.nlm.nih.gov/BLAST/.
-   3. Select qualifying target sequences for synthesis. At Ambion, preferably several target sequences can be selected along the length of the gene to evaluate.
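Step 1 of the protocol above can be sketched as a simple scan. This is a minimal illustration under stated assumptions, not the Ambion program itself: the function name is hypothetical, the first AUG found in the input is treated as the start codon (a real transcript would need its annotated start), and neither the 3′ UTR exclusion nor the BLAST comparison of step 2 is implemented.

```python
def find_sirna_target_sites(mrna, min_offset=75, site_len=19):
    """List candidate sites: each AA dinucleotide plus the 19 nt 3' of it.

    Returns (index, sequence) pairs, where `index` is the 0-based position of
    the AA in `mrna`. Candidates whose AA lies fewer than `min_offset` bases
    downstream of the start codon are skipped.
    """
    seq = mrna.lower().replace("t", "u")
    start = seq.find("aug")  # assumption: first AUG is the start codon
    if start < 0:
        return []
    total = 2 + site_len
    sites = []
    for i in range(start, len(seq) - total + 1):
        if seq[i:i + 2] == "aa" and i - start >= min_offset:
            sites.append((i, seq[i:i + total]))
    return sites


# A made-up transcript: 2 nt leader, AUG, 80 filler bases, then one AA site.
tail = "gcuagcuagcuagcuagcu"  # 19 nucleotides
example = "gg" + "aug" + "c" * 80 + "aa" + tail + "cc"
candidates = find_sirna_target_sites(example)
```

Setting `min_offset=0` disables the start-codon filter, which can be useful for seeing every AA-anchored candidate before any exclusion is applied.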

The regulatory sequences flanking the BRC-associated gene sequences can be identical or different, such that their expression can be modulated independently, or in a temporal or spatial manner. siRNAs are transcribed intracellularly by cloning the BRC-associated gene templates, respectively, into a vector containing, e.g., an RNA pol III transcription unit from the small nuclear RNA (snRNA) U6 or the human H1 RNA promoter. For introducing the vector into the cell, a transfection-enhancing agent can be used. FuGENE (Roche Diagnostics), Lipofectamine 2000 (Invitrogen), Oligofectamine (Invitrogen), and Nucleofector (Wako Pure Chemical) are useful as the transfection-enhancing agent.

The antisense oligonucleotide or siRNA of the present invention inhibits the expression of a polypeptide of the present invention and is thereby useful for suppressing the biological activity of a polypeptide of the invention. Also, expression-inhibitors, comprising the antisense oligonucleotide or siRNA of the invention, are useful in that they can inhibit the biological activity of the polypeptide of the invention. Therefore, a composition comprising an antisense oligonucleotide or siRNA of the present invention is useful for treating breast cancer.

Antibodies:

Alternatively, the function of one or more gene products of the genes over-expressed in BRC can be inhibited by administering a compound that binds to or otherwise inhibits the function of the gene products. For example, the compound is an antibody which binds to the over-expressed gene product or gene products.

The present invention refers to the use of antibodies, particularly antibodies against a protein encoded by an up-regulated marker gene, or a fragment of such an antibody. As used herein, the term “antibody” refers to an immunoglobulin molecule having a specific structure, that interacts (i.e., binds) only with the antigen that was used for synthesizing the antibody (i.e., the gene product of an up-regulated marker) or with an antigen closely related thereto. Furthermore, an antibody may be a fragment of an antibody or a modified antibody, so long as it binds to one or more of the proteins encoded by the marker genes. For instance, the antibody fragment may be Fab, F(ab′)₂, Fv, or single chain Fv (scFv), in which Fv fragments from H and L chains are ligated by an appropriate linker (Huston J. S. et al. Proc. Natl. Acad. Sci. U.S.A. 85:5879-5883 (1988)). More specifically, an antibody fragment may be generated by treating an antibody with an enzyme, such as papain or pepsin. Alternatively, a gene encoding the antibody fragment may be constructed, inserted into an expression vector, and expressed in an appropriate host cell (see, for example, Co M. S. et al. J. Immunol. 152:2968-2976 (1994); Better M. and Horwitz A. H. Methods Enzymol. 178:476-496 (1989); Pluckthun A. and Skerra A. Methods Enzymol. 178:497-515 (1989); Lamoyi E. Methods Enzymol. 121:652-663 (1986); Rousseaux J. et al. Methods Enzymol. 121:663-669 (1986); Bird R. E. and Walker B. W. Trends Biotechnol. 9:132-137 (1991)).

An antibody may be modified by conjugation with a variety of molecules, such as polyethylene glycol (PEG). The present invention provides such modified antibodies. The modified antibody can be obtained by chemically modifying an antibody. Such modification methods are conventional in the field.

Alternatively, an antibody may comprise a chimeric antibody having a variable region derived from a nonhuman antibody and a constant region derived from a human antibody, or a humanized antibody, comprising a complementarity determining region (CDR) derived from a nonhuman antibody, a framework region (FR) and a constant region derived from a human antibody. Such antibodies can be prepared by using known technologies.

Cancer therapies directed at specific molecular alterations that occur in cancer cells have been validated through clinical development and regulatory approval of anti-cancer drugs such as trastuzumab (Herceptin) for the treatment of advanced breast cancer, imatinib mesylate (Gleevec) for chronic myeloid leukemia, gefitinib (Iressa) for non-small cell lung cancer (NSCLC), and rituximab (anti-CD20 mAb) for B-cell lymphoma and mantle cell lymphoma (Ciardiello F, Tortora G. A novel approach in the treatment of cancer: targeting the epidermal growth factor receptor. Clin Cancer Res. 2001 October; 7(10):2958-70. Review.; Slamon D J, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, Fleming T, Eiermann W, Wolter J, Pegram M, Baselga J, Norton L. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med. 2001 Mar. 15; 344(11):783-92; Rehwald U, Schulz H, Reiser M, Sieber M, Staak J O, Morschhauser F, Driessen C, Rudiger T, Muller-Hermelink K, Diehl V, Engert A. Treatment of relapsed CD20+ Hodgkin lymphoma with the monoclonal antibody rituximab is effective and well tolerated: results of a phase 2 trial of the German Hodgkin Lymphoma Study Group. Blood. 2003 Jan. 15; 101(2):420-424; Fang G, Kim C N, Perkins C L, Ramadevi N, Winton E, Wittmann S and Bhalla K N. (2000). Blood, 96, 2246-2253). These drugs are clinically effective and better tolerated than traditional anti-cancer agents because they target only transformed cells. Hence, such drugs not only improve survival and quality of life for cancer patients, but also validate the concept of molecularly targeted cancer therapy. Furthermore, targeted drugs can enhance the efficacy of standard chemotherapy when used in combination with it (Gianni L (2002). Oncology, 63 Suppl 1, 47-56; Klejman A, Rushen L, Morrione A, Slupianek A and Skorski T. (2002). Oncogene, 21, 5868-5876). Therefore, future cancer treatments will probably involve combining conventional drugs with target-specific agents aimed at different characteristics of tumor cells such as angiogenesis and invasiveness.

These modulatory methods can be performed ex vivo or in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). The methods involve administering a protein or combination of proteins or a nucleic acid molecule or combination of nucleic acid molecules as therapy to counteract aberrant expression of the differentially expressed genes or aberrant activity of their gene products.

Diseases and disorders that are characterized by increased (relative to a subject not suffering from the disease or disorder) expression levels or biological activities of genes and gene products, respectively, may be treated with therapeutics that antagonize (i.e., reduce or inhibit) activity of the over-expressed gene or genes. Therapeutics that antagonize activity can be administered therapeutically or prophylactically.

Accordingly, therapeutics that may be utilized in the context of the present invention include, e.g., (i) a polypeptide of the over-expressed or under-expressed gene or genes, or analogs, derivatives, fragments or homologs thereof; (ii) antibodies to the over-expressed gene or gene products; (iii) nucleic acids encoding the over-expressed or under-expressed gene or genes; (iv) antisense nucleic acids or nucleic acids that are “dysfunctional” (i.e., due to a heterologous insertion within the nucleic acids of one or more over-expressed gene or genes); (v) small interfering RNA (siRNA); or (vi) modulators (i.e., inhibitors, agonists and antagonists that alter the interaction between an over-expressed or under-expressed polypeptide and its binding partner). The dysfunctional antisense molecules are utilized to “knockout” endogenous function of a polypeptide by homologous recombination (see, e.g., Capecchi, Science 244: 1288-1292 (1989)).

Diseases and disorders that are characterized by decreased (relative to a subject not suffering from the disease or disorder) biological activity may be treated with therapeutics that increase (i.e., are agonists to) activity. Therapeutics that up-regulate activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may be utilized include, but are not limited to, a polypeptide (or analogs, derivatives, fragments or homologs thereof) or an agonist that increases bioavailability.

Increased or decreased levels can be readily detected by quantifying peptide and/or RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro for RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs of a gene whose expression is altered). Methods that are well-known within the art include, but are not limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, etc.) and/or hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot blots, in situ hybridization, etc.).

Prophylactic administration occurs prior to the manifestation of overt clinical symptoms of disease, such that a disease or disorder is prevented or, alternatively, delayed in its progression.

Therapeutic methods of the present invention may include the step of contacting a cell with an agent that modulates one or more of the activities of the gene products of the differentially expressed genes. Examples of agents that modulate protein activity include, but are not limited to, nucleic acids, proteins, naturally-occurring cognate ligands of such proteins, peptides, peptidomimetics, and other small molecules. For example, a suitable agent may stimulate one or more protein activities of one or more differentially under-expressed genes.

Vaccinating Against Breast Cancer:

The present invention also relates to a method of treating or preventing breast cancer in a subject comprising the step of administering to said subject a vaccine comprising a polypeptide encoded by a nucleic acid selected from the group consisting of the BRC-associated genes listed in Tables 3, 5, and 7 (i.e., up-regulated genes), an immunologically active fragment of said polypeptide, or a polynucleotide encoding such a polypeptide or fragment thereof. Administration of the polypeptide induces an anti-tumor immunity in a subject. To induce anti-tumor immunity, a polypeptide encoded by a nucleic acid selected from the group consisting of the BRC-associated genes listed in Tables 3, 5, and 7, an immunologically active fragment of said polypeptide, or a polynucleotide encoding such a polypeptide or fragment thereof is administered to a subject in need thereof. Furthermore, the polypeptide encoded by a nucleic acid selected from the group consisting of the BRC-associated genes listed in Tables 5 and 7 may induce anti-tumor immunity against invasion of breast cancer and IDC, respectively. The polypeptide or the immunologically active fragments thereof are useful as vaccines against BRC. In some cases, the proteins or fragments thereof may be administered in a form bound to the T cell receptor (TCR) or presented by an antigen presenting cell (APC), such as a macrophage, dendritic cell (DC), or B-cell. Due to the strong antigen presenting ability of DCs, the use of DCs is most preferable among the APCs.

In the present invention, a vaccine against BRC refers to a substance that has the ability to induce anti-tumor immunity upon inoculation into animals. According to the present invention, polypeptides encoded by the BRC-associated genes listed in Tables 3, 5, and 7, or fragments thereof, were suggested to be HLA-A24 or HLA-A*0201 restricted epitope peptides that may induce a potent and specific immune response against BRC cells expressing the BRC-associated genes listed in Tables 3, 5, and 7. Thus, the present invention also encompasses a method of inducing anti-tumor immunity using the polypeptides. In general, anti-tumor immunity includes immune responses such as the following:

-   induction of cytotoxic lymphocytes against tumors,
-   induction of antibodies that recognize tumors, and
-   induction of anti-tumor cytokine production.

Therefore, when a certain protein induces any one of these immune responses upon inoculation into an animal, the protein is determined to have an anti-tumor immunity inducing effect. The induction of anti-tumor immunity by a protein can be detected by observing, in vivo or in vitro, the response of the immune system in the host against the protein.

For example, a method for detecting the induction of cytotoxic T lymphocytes is well known. Specifically, a foreign substance that enters the living body is presented to T cells and B cells by the action of antigen presenting cells (APCs). T cells that respond to the antigen presented by the APCs in an antigen specific manner differentiate into cytotoxic T cells (or cytotoxic T lymphocytes; CTLs) due to stimulation by the antigen, and then proliferate (this is referred to as activation of T cells). Therefore, CTL induction by a certain peptide can be evaluated by presenting the peptide to a T cell via an APC, and detecting the induction of CTLs. Furthermore, APCs have the effect of activating CD4+ T cells, CD8+ T cells, macrophages, eosinophils, and NK cells. Since CD4+ T cells and CD8+ T cells are also important in anti-tumor immunity, the anti-tumor immunity-inducing action of the peptide can be evaluated using the activation effect of these cells as indicators.

A method for evaluating the inducing action of CTLs using dendritic cells (DCs) as the APC is well known in the art. DCs are representative APCs having the strongest CTL-inducing action among APCs. In this method, the test polypeptide is initially contacted with DCs, and then the DCs are contacted with T cells. Detection of T cells having cytotoxic effects against the cells of interest after the contact with DCs shows that the test polypeptide has an activity of inducing cytotoxic T cells. Activity of CTLs against tumors can be detected, for example, using the lysis of ⁵¹Cr-labeled tumor cells as the indicator. Alternatively, the method of evaluating the degree of tumor cell damage using ³H-thymidine uptake activity or LDH (lactate dehydrogenase)-release as the indicator is also well known.

Apart from DCs, peripheral blood mononuclear cells (PBMCs) may also be used as the APC. The induction of CTLs has been reported to be enhanced by culturing PBMCs in the presence of GM-CSF and IL-4. Similarly, CTLs have been shown to be induced by culturing PBMCs in the presence of keyhole limpet hemocyanin (KLH) and IL-7.

Test polypeptides confirmed to possess CTL-inducing activity by these methods are deemed to be polypeptides having a DC activation effect and subsequent CTL-inducing activity. Therefore, polypeptides that induce CTLs against tumor cells are useful as vaccines against tumors. Furthermore, APCs that have acquired the ability to induce CTLs against tumors through contact with the polypeptides are also useful as vaccines against tumors. In addition, CTLs that have acquired cytotoxicity due to presentation of the polypeptide antigens by APCs can also be used as vaccines against tumors. Such therapeutic methods for tumors, using anti-tumor immunity due to APCs and CTLs, are referred to as cellular immunotherapy.

Generally, when using a polypeptide for cellular immunotherapy, efficiency of the CTL-induction is known to be increased by combining a plurality of polypeptides having different structures and contacting them with DCs. Therefore, when stimulating DCs with protein fragments, it is advantageous to use a mixture of multiple types of fragments.

Alternatively, the induction of anti-tumor immunity by a polypeptide can be confirmed by observing the induction of antibody production against tumors. For example, when antibodies against a polypeptide are induced in a laboratory animal immunized with the polypeptide, and when growth of tumor cells is suppressed by those antibodies, the polypeptide is deemed to have the ability to induce anti-tumor immunity.

Anti-tumor immunity is induced by administering the vaccine of this invention, and the induction of anti-tumor immunity enables treatment and prevention of BRC. Therapy against cancer or prevention of the onset of cancer includes any of the following steps, such as inhibition of the growth of cancerous cells, involution of cancer, and suppression of the occurrence of cancer. A decrease in mortality and morbidity of individuals having cancer, a decrease in the levels of tumor markers in the blood, alleviation of detectable symptoms accompanying cancer, and such are also included in the therapy or prevention of cancer. Such therapeutic and preventive effects are preferably statistically significant, for example, observed at a significance level of 5% or less when the therapeutic or preventive effect of a vaccine against cell proliferative diseases is compared to a control without vaccine administration. For example, Student's t-test, the Mann-Whitney U-test, or ANOVA may be used for statistical analysis.

The above-mentioned protein having immunological activity or a vector encoding the protein may be combined with an adjuvant. An adjuvant refers to a compound that enhances the immune response against the protein when administered together (or successively) with the protein having immunological activity. Exemplary adjuvants include, but are not limited to, cholera toxin, salmonella toxin, alum, and such. Furthermore, the vaccine of this invention may be combined appropriately with a pharmaceutically acceptable carrier. Examples of such carriers include sterilized water, physiological saline, phosphate buffer, culture fluid, and such. Furthermore, the vaccine may contain, as necessary, stabilizers, suspensions, preservatives, surfactants, and such. The vaccine can be administered systemically or locally. Vaccine administration can be performed by single administration, or boosted by multiple administrations.

When using an APC or CTL as the vaccine of this invention, tumors can be treated or prevented, for example, by the ex vivo method. More specifically, PBMCs of the subject receiving treatment or prevention are collected, the cells are contacted with the polypeptide ex vivo, and following the induction of APCs or CTLs, the cells may be administered to the subject. APCs can also be induced by introducing a vector encoding the polypeptide into PBMCs ex vivo. APCs or CTLs induced in vitro can be cloned prior to administration. By cloning and growing cells having high activity of damaging target cells, cellular immunotherapy can be performed more effectively. Furthermore, APCs and CTLs isolated in this manner may be used for cellular immunotherapy not only against individuals from whom the cells are derived, but also against similar types of tumors from other individuals.

Furthermore, a pharmaceutical composition for treating or preventing a cell proliferative disease, such as cancer, comprising a pharmaceutically effective amount of the polypeptide of the present invention is provided. The pharmaceutical composition may be used for raising anti-tumor immunity.

Pharmaceutical Compositions for Inhibiting BRC or Malignant BRC:

In the context of the present invention, suitable pharmaceutical formulations include those suitable for oral, rectal, nasal, topical (including buccal and sub-lingual), vaginal or parenteral (including intramuscular, sub-cutaneous and intravenous) administration, or for administration by inhalation or insufflation. Preferably, administration is intravenous. The formulations are optionally packaged in discrete dosage units.

Pharmaceutical formulations suitable for oral administration include capsules, cachets or tablets, each containing a predetermined amount of active ingredient. Suitable formulations also include powders, granules, solutions, suspensions and emulsions. The active ingredient is optionally administered as a bolus, electuary or paste. Tablets and capsules for oral administration may contain conventional excipients, such as binding agents, fillers, lubricants, disintegrants and/or wetting agents. A tablet may be made by compression or molding, optionally with one or more formulational ingredients. Compressed tablets may be prepared by compressing in a suitable machine the active ingredients in a free-flowing form, such as a powder or granules, optionally mixed with a binder, lubricant, inert diluent, surface active and/or dispersing agent. Molded tablets may be made by molding in a suitable machine a mixture of the powdered compound moistened with an inert liquid diluent. The tablets may be coated according to methods well known in the art. Oral fluid preparations may be in the form of, for example, aqueous or oily suspensions, solutions, emulsions, syrups or elixirs, or may be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations may contain conventional additives, such as suspending agents, emulsifying agents, non-aqueous vehicles (which may include edible oils), and/or preservatives. The tablets may optionally be formulated so as to provide slow or controlled release of the active ingredient therein. A package of tablets may contain one tablet to be taken on each day of the month.

Formulations suitable for parenteral administration include aqueous and non-aqueous sterile injection solutions, optionally containing anti-oxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the blood of the intended recipient; as well as aqueous and non-aqueous sterile suspensions including suspending agents and/or thickening agents. The formulations may be presented in unit dose or multi-dose containers, for example as sealed ampoules and vials, and may be stored in a freeze-dried (lyophilized) condition, requiring only the addition of the sterile liquid carrier, for example, saline or water-for-injection, immediately prior to use. Alternatively, the formulations may be presented for continuous infusion. Extemporaneous injection solutions and suspensions may be prepared from sterile powders, granules and tablets of the kind previously described.

Formulations suitable for rectal administration include suppositories with standard carriers such as cocoa butter or polyethylene glycol. Formulations suitable for topical administration in the mouth, for example, buccally or sublingually, include lozenges, containing the active ingredient in a flavored base such as sucrose and acacia or tragacanth, and pastilles, comprising the active ingredient in a base such as gelatin and glycerin or sucrose and acacia. For intra-nasal administration, the compounds of the invention may be used as a liquid spray, a dispersible powder, or in the form of drops. Drops may be formulated with an aqueous or non-aqueous base also comprising one or more dispersing agents, solubilizing agents and/or suspending agents.

For administration by inhalation, the compounds can be conveniently delivered from an insufflator, nebulizer, pressurized packs or other convenient means of delivering an aerosol spray. Pressurized packs may comprise a suitable propellant such as dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol, the dosage unit may be determined by providing a valve to deliver a metered amount.

Alternatively, for administration by inhalation or insufflation, the compounds may take the form of a dry powder composition, for example a powder mix of the compound and a suitable powder base, such as lactose or starch. The powder composition may be presented in unit dosage form, for example, as capsules, cartridges, gelatin or blister packs, from which the powder may be administered with the aid of an inhalator or insufflator.

Other formulations include implantable devices and adhesive patches which release a therapeutic agent.

When desired, the above-described formulations, adapted to give sustained release of the active ingredient, may be employed. The pharmaceutical compositions may also contain other active ingredients, such as antimicrobial agents, immunosuppressants and/or preservatives.

It should be understood that in addition to the ingredients particularly mentioned above, the formulations of this invention may include other agents conventional in the art with regard to the type of formulation in question. For example, formulations suitable for oral administration may include flavoring agents.

Preferred unit dosage formulations contain an effective dose, as recited below, or an appropriate fraction thereof, of the active ingredient.

For each of the aforementioned conditions, the compositions, e.g., polypeptides and organic compounds, can be administered orally or via injection at a dose ranging from about 0.1 to about 250 mg/kg per day. The dose range for adult humans is generally from about 5 mg to about 17.5 g/day, preferably about 5 mg to about 10 g/day, and most preferably about 100 mg to about 3 g/day. Tablets or other unit dosage forms of presentation provided in discrete units may conveniently contain an amount which is effective at such dosage or as a multiple of the same, for instance, units containing about 5 mg to about 500 mg, usually from about 100 mg to about 500 mg.
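The relationship between the per-kilogram range and the adult daily range can be checked with simple arithmetic. The following is only an illustrative sketch: the 70 kg body weight and the function names are assumptions introduced here and are not stated in the specification. At about 250 mg/kg per day, a 70 kg adult would receive about 17.5 g/day, which agrees with the upper limit recited above.

```python
import math

# Hypothetical adult body weight; the specification does not state one.
ASSUMED_ADULT_WEIGHT_KG = 70


def daily_dose_mg(weight_kg, dose_mg_per_kg):
    """Total daily dose in mg for a body weight and a mg/kg/day rate."""
    return weight_kg * dose_mg_per_kg


def units_per_day(daily_mg, unit_mg):
    """Smallest whole number of unit dosage forms that covers the daily dose."""
    return math.ceil(daily_mg / unit_mg)


if __name__ == "__main__":
    upper_mg = daily_dose_mg(ASSUMED_ADULT_WEIGHT_KG, 250)
    print(upper_mg / 1000, "g/day at 250 mg/kg")
    print(units_per_day(3000, 500), "units of 500 mg for a 3 g/day dose")
```

The second helper merely illustrates how a daily dose might be divided into the discrete units mentioned above; actual dosing is determined by those skilled in the art as described below.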

The dose employed will depend upon a number of factors, including the age and sex of the subject, the precise disorder being treated, and its severity. Also, the route of administration may vary depending upon the condition and its severity. In any event, appropriate and optimum dosages may be routinely calculated by those skilled in the art, taking into consideration the above-mentioned factors.

Aspects of the present invention are described in the following examples, which are not intended to limit the scope of the invention described in the claims. The following examples illustrate the identification and characterization of genes differentially expressed in BRC cells.

EXAMPLES

Tissue obtained from diseased tissue (e.g., epithelial cells from BRC) and normal tissues was evaluated to identify genes which are differentially expressed in a disease state, e.g., BRC. The assays were carried out as follows.

Patients and Tissue Samples:

Primary breast cancers were obtained with informed consent from 81 patients (12 ductal carcinoma in situ and 69 invasive ductal carcinoma from 2 cm to 5 cm (T2), median age 45 in a range of 21 to 68 years old) who were treated at the Department of Breast Surgery, Cancer Institute Hospital, Tokyo, Japan (Table 12). Clinical information was obtained from medical records, and each tumor was diagnosed according to histopathological subtype and grade by pathologists. Tumor tissue was used to evaluate tumor type (according to the World Health Organization classification and the Japanese cancer society classification). Clinical stage was judged according to the JBCS TNM classification. No significant differences were observed between node-positive and node-negative cases. The presence of angioinvasive growth and extensive lymphocytic infiltrate was determined by pathologists. Estrogen receptor (ER) and progesterone receptor (PgR) expression was determined by EIA (ER negative when less than 13 fmol/mg protein, BML). Mixtures of normal breast ductal cells from 15 premenopausal patients with breast cancer and from 12 postmenopausal patients were used as the respective normal controls. All samples were immediately frozen and stored at −80° C.

Tissue Samples and LMM:

Clinical and pathological information on the tumors is detailed in Table 12. Samples were embedded in TissueTek OCT medium (Sakura) and then stored at −80° C. until use. Frozen specimens were serially sectioned in 8-μm slices with a cryostat and stained with hematoxylin and eosin to define the analyzed regions. To avoid cross-contamination of cancer and noncancerous cells, these two populations were prepared using the EZ Cut LMM System (SL Microtest GmbH) following the manufacturer's protocol with several modifications. To minimize the effects of the storage process and tissue collection, the cancer tissues were carefully handled by the same procedure. To check the quality of RNAs, total RNA extracted from the residual tissue of each case was electrophoresed on a denaturing agarose gel, and its quality was confirmed by the presence of ribosomal RNA bands.

RNA Extraction and T7-Based RNA Amplification:

Total RNA was extracted from each population of laser captured cells into 350 μl RLT lysis buffer (QIAGEN). The extracted RNA was treated for 30 minutes at room temperature with 30 units of DNase I (QIAGEN). After inactivation at 70° C. for 10 min, the RNAs were purified with an RNeasy Mini Kit (QIAGEN) according to the manufacturer's recommendations. All of the DNase I treated RNA was subjected to T7-based amplification using the Ampliscribe T7 Transcription Kit (Epicentre Technologies). Two rounds of amplification yielded 28.8-329.4 μg of amplified RNAs (aRNAs) for each sample, whereas when RNAs from normal samples from 15 premenopausal patients or 12 postmenopausal patients were amplified, totals of 2240.2 μg and 2023.8 μg were yielded, respectively. 2.5 μg aliquots of aRNA from cancerous cells and from noncancerous breast ductal cells were reverse-transcribed in the presence of Cy5-dCTP and Cy3-dCTP (Amersham Biosciences), respectively.

cDNA Microarrays:

A “genome-wide” cDNA microarray system was established containing 23,040 cDNAs selected from the UniGene database (build #131) of the National Center for Biotechnology Information (NCBI). Fabrication of cDNA microarray slides has been described elsewhere (Ono K, Tanaka T, Tsunoda T, Kitahara O, Kihara C, Okamoto A, Ochiai K, Katagiri T and Nakamura Y. Identification by cDNA Microarray of Genes Involved in Ovarian Carcinogenesis. Cancer Res., 60, 5007-11, 2000). Briefly, the cDNAs were amplified by reverse transcription-PCR using poly(A)+ RNA isolated from various human organs as templates; lengths of the amplicons ranged from 200 to 1100 bp without repetitive or poly(A) sequences. The PCR products were spotted in duplicate on type-7 glass slides (Amersham Biosciences) using a Lucidea Array Spotter (Amersham Biosciences); 4,608 or 9,216 genes were spotted in duplicate on a single slide. Three different sets of slides (total 23,040 genes) were prepared, each of which was spotted with the same 52 housekeeping genes and two kinds of negative-control genes as well.

Hybridization and Acquisition of Data:

Hybridization and washing were performed according to protocols described previously except that all processes were carried out with an Automated Slide Processor (Amersham Biosciences) (Giuliani, N., et al., V. Human myeloma cells stimulate the receptor activator of nuclear factor-kappa B ligand (RANKL) in T lymphocytes: a potential role in multiple myeloma bone disease. Blood, 100: 4615-4621, 2002). The intensity of each hybridization signal was calculated photometrically by the ArrayVision computer program (Amersham Biosciences) and background intensity was subtracted. Normalization of the fluorescence intensities of Cy5 (tumor) and Cy3 (control) for each target spot was performed using the averaged signals from the 52 housekeeping genes. Because data derived from low signal intensities are less reliable, a cut-off value for signal intensities on each slide was determined, and genes were excluded from further analysis when both Cy3 and Cy5 dyes gave signal intensities lower than the cut-off. A cut-off value for each expression level was automatically calculated according to background fluctuation. When both Cy5 and Cy3 signal intensities were lower than the cut-off values, expression of the corresponding gene in that sample was assessed as absent. The Cy5/Cy3 ratio was calculated as the relative expression ratio. For other genes, the Cy5/Cy3 ratio was calculated using the raw data for each sample.

Signal intensities of Cy3 and Cy5 from the 23,040 spots were quantified and analyzed by subtracting backgrounds, using ArrayVision software (Imaging Research, Inc., St. Catharines, Ontario, Canada). Subsequently, the fluorescent intensities of Cy5 (tumor) and Cy3 (control) for each target spot were adjusted so that the mean Cy3/Cy5 ratio of 52 housekeeping genes on the array was equal to one. Because data derived from low signal intensities are less reliable, a cut-off value on each slide was determined as described previously (Ono, K., et al., Identification by cDNA microarray of genes involved in ovarian carcinogenesis. Cancer Res, 60: 5007-5011, 2000.) and genes were excluded from further analysis when both Cy3 and Cy5 dyes yielded signal intensities lower than the cut-off (Saito-Hisaminato, A., Katagiri, T., Kakiuchi, S., Nakamura, T., Tsunoda, T., and Nakamura, Y. Genome-wide profiling of gene expression in 29 normal human tissues with a cDNA microarray. DNA Res, 9: 35-45, 2002). For other genes, the Cy5/Cy3 ratio was calculated using the raw data for each sample.
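The normalization and filtering steps described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in ArrayVision: the function name, the choice to rescale the Cy5 channel, the Cy5/Cy3 orientation of the housekeeping ratio, and the application of the cut-off after rescaling are all assumptions made for the example.

```python
import statistics


def normalize_and_filter(cy5, cy3, housekeeping_idx, cutoff):
    """Return one relative expression ratio (Cy5/Cy3) per spot, or None.

    cy5, cy3         : background-subtracted intensities, one value per spot
    housekeeping_idx : indices of the housekeeping-gene spots
    cutoff           : per-slide signal cut-off (assumed to be given)
    """
    # Scale factor chosen so that the mean Cy5/Cy3 ratio of the
    # housekeeping spots becomes one (assumed orientation).
    scale = statistics.mean(cy5[i] / cy3[i] for i in housekeeping_idx)

    ratios = []
    for tumor, control in zip(cy5, cy3):
        tumor_adj = tumor / scale
        if tumor_adj < cutoff and control < cutoff:
            # Both dyes below the cut-off: expression assessed as absent.
            ratios.append(None)
        elif control <= 0:
            # Guard added for the sketch: a ratio cannot be formed.
            ratios.append(None)
        else:
            ratios.append(tumor_adj / control)
    return ratios


if __name__ == "__main__":
    cy5 = [200.0, 400.0, 50.0, 800.0]
    cy3 = [100.0, 200.0, 40.0, 200.0]
    print(normalize_and_filter(cy5, cy3, housekeeping_idx=[0, 1], cutoff=60.0))
```

In this toy input the two housekeeping spots have a raw ratio of 2, so all Cy5 values are halved; the third spot falls below the cut-off in both channels and is excluded.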

Calculation of Contamination Percentage:

Perilipin (PLIN) and fatty acid binding protein 4 (FABP4) were found to be expressed exclusively in adipose tissue and mammary gland tissue in gene expression profiles of 29 normal human tissues obtained with a cDNA microarray (Saito-Hisaminato, A. et al., Genome-wide profiling of gene expression in 29 normal human tissues with a cDNA microarray. DNA Res, 9: 35-45, 2002). These genes were used to evaluate the proportion of adipocytes present in the population of microdissected normal breast ductal epithelial cells. aRNA from poly(A)+ RNA isolated from normal whole mammary gland (Clontech) and aRNA from microdissected normal breast ductal epithelial cells were reverse-transcribed in the presence of Cy5-dCTP and Cy3-dCTP, respectively. After hybridization on microarray slides, the Cy5/Cy3 ratio was calculated. The average of each ratio was determined from the results obtained using mammary gland tissue and microdissected normal breast ductal cells from premenopausal and postmenopausal patients.

Cluster Analysis of 102 Samples with 81 Breast Carcinoma According to Gene-Expression Profiles:

An unsupervised hierarchical clustering method was applied to both genes and tumors. To obtain reproducible clusters for classification of the 102 samples, 710 genes were selected for which valid data were obtained in 80% of the experiments and whose expression ratios varied by standard deviations of more than 1.1. The analysis was performed using web-available software (“Cluster” and “TreeView”) written by M. Eisen (at genome-www5.stanford.edu/MicroArray/SMD/restech.html). Before the clustering algorithm was applied, the fluorescence ratio for each spot was log-transformed and the data for each sample were median-centered to remove experimental biases; average linkage was used.
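The gene selection and pre-processing steps that precede clustering may be sketched as below. The sketch does not reproduce the “Cluster” software; the function names are hypothetical, base-2 logarithms are assumed, and the standard deviation is assumed to be taken over the log-transformed ratios, none of which is specified in the text.

```python
import math
import statistics


def select_genes(ratios_by_gene, min_valid_fraction=0.8, min_sd=1.1):
    """Keep genes with enough valid data and sufficiently variable ratios.

    ratios_by_gene maps a gene name to a list of Cy5/Cy3 ratios, one per
    experiment, with None marking an invalid (below cut-off) measurement.
    """
    selected = {}
    for gene, ratios in ratios_by_gene.items():
        valid = [r for r in ratios if r is not None]
        if len(valid) < 2 or len(valid) < min_valid_fraction * len(ratios):
            continue
        # Assumption: variability is judged on log2-transformed ratios.
        if statistics.stdev(math.log2(r) for r in valid) > min_sd:
            selected[gene] = ratios
    return selected


def median_center(sample_log_ratios):
    """Subtract the sample median from every valid log ratio of one sample."""
    valid = [v for v in sample_log_ratios if v is not None]
    median = statistics.median(valid)
    return [None if v is None else v - median for v in sample_log_ratios]


if __name__ == "__main__":
    data = {
        "g1": [8.0, 0.125, 4.0, 0.25, 1.0],   # variable, all valid
        "g2": [1.0, 1.1, 0.9, 1.0, 1.0],      # nearly constant
        "g3": [8.0, None, None, 0.125, 1.0],  # valid in only 60%
    }
    print(list(select_genes(data)))
    print(median_center([3.0, -1.0, None, 1.0]))
```

The filtered, centered matrix would then be passed to an average-linkage hierarchical clustering routine, which is outside the scope of this sketch.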

Identification of Up or Down-Regulated Genes Between DCIS and IDC:

The relative expression ratio of each gene (Cy5/Cy3 intensity ratio) was classified into one of four categories: (A) up-regulated (expression ratio > 2.0); (B) down-regulated (expression ratio < 0.5); (C) unchanged (expression ratio between 0.5 and 2.0); and (D) not expressed (or slight expression but under the cutoff level for detection). These categories were used to detect a set of genes for which changes in the expression ratios were common among samples. To detect candidate genes that were commonly up- or down-regulated in each group, the overall expression patterns of 23,040 genes were first screened to select genes with expression ratios > 3.0 or < 1/3 that were present in > 50% of the groups categorized.
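The four-category classification and the subsequent screening can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that the ">50%" criterion is evaluated as a fraction of all cases in a group, with undetected measurements (None) counted in the denominator; the text does not specify this.

```python
def categorize(ratio):
    """Classify one Cy5/Cy3 ratio as 'A', 'B', 'C' or 'D'.

    None stands for a measurement under the cut-off level for detection.
    """
    if ratio is None:
        return "D"  # not expressed or under the detection cut-off
    if ratio > 2.0:
        return "A"  # up-regulated
    if ratio < 0.5:
        return "B"  # down-regulated
    return "C"      # unchanged (0.5 to 2.0 inclusive)


def commonly_changed(ratios, up=3.0, down=1.0 / 3.0, min_fraction=0.5):
    """Return 'up', 'down' or None for one gene across the cases of a group."""
    n_cases = len(ratios)
    n_up = sum(1 for r in ratios if r is not None and r > up)
    n_down = sum(1 for r in ratios if r is not None and r < down)
    if n_up > min_fraction * n_cases:
        return "up"
    if n_down > min_fraction * n_cases:
        return "down"
    return None


if __name__ == "__main__":
    print([categorize(r) for r in (2.5, 0.4, 1.0, None)])
    print(commonly_changed([3.5, 4.0, 5.0, 1.0]))
```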

Semi-Quantitative RT-PCR:

Five up-regulated genes were selected and their expression levels were examined by semi-quantitative RT-PCR experiments. A 1-μg aliquot of aRNA from each sample was reverse-transcribed for single-stranded cDNAs using random primer (Taniguchi, K., et al., Mutational spectrum of beta-catenin, AXIN1, and AXIN2 in hepatocellular carcinomas and hepatoblastomas. Oncogene, 21: 4863-4871, 2002.) and Superscript II (Life Technologies, Inc.). Each cDNA mixture was diluted for subsequent PCR amplification with the primer sets shown in Table 9. Expression of GAPDH served as an internal control. PCR reactions were optimized for the number of cycles to ensure product intensity within the linear phase of amplification.

Identification of Genes Responsible for Histopathological Status, ER Status and Lymph-Node Metastasis in Breast Cancer:

The discriminating genes were selected using the following two criteria: (1) signal intensities higher than the cut-off level in at least 70% (ER status) or 50% (histopathological status and lymph-node metastasis) of the cases; (2) |Med_r − Med_n| > 1 (ER status) or > 0.5 (histopathological status and lymph-node metastasis), where Med indicates the median derived from log-transformed relative expression ratios in node-positive or node-negative cases. Next, a random permutation test was applied to identify genes that were expressed differently between one group (group A) and another (group B). Means (μ) and standard deviations (σ) were calculated from the log-transformed relative expression ratios of each gene in group A (r) and group B (n) cases. A discrimination score (DS) for each gene was defined as follows:

DS = (μ_r − μ_n)/(σ_r + σ_n)

Permutation tests were carried out to estimate the ability of individual genes to distinguish between group A and group B; samples were randomly permuted between the two classes 10,000 times. Since the DS dataset of each gene showed a normal distribution, a P value was calculated for the user-defined grouping (Golub, T. et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286: 531-537, 1999).
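The discrimination score and the permutation test may be sketched for a single gene as follows. This is an illustration only: the function names are hypothetical, the sample standard deviation is assumed, and the two-sided P value obtained from a normal distribution fitted to the permuted DS values is one plausible reading of the text rather than the confirmed procedure.

```python
import random
import statistics


def discrimination_score(group_r, group_n):
    """DS = (mu_r - mu_n) / (sigma_r + sigma_n) for one gene."""
    spread = statistics.stdev(group_r) + statistics.stdev(group_n)
    if spread == 0:
        raise ValueError("both groups have zero spread; DS is undefined")
    return (statistics.mean(group_r) - statistics.mean(group_n)) / spread


def permutation_p_value(group_r, group_n, n_perm=10000, seed=0):
    """Two-sided P value of the observed DS against permuted groupings.

    The permuted DS values are summarised by their mean and standard
    deviation, and the observed DS is referred to that normal distribution.
    """
    rng = random.Random(seed)
    observed = discrimination_score(group_r, group_n)
    pooled = list(group_r) + list(group_n)
    k = len(group_r)

    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of samples to the classes
        null_scores.append(discrimination_score(pooled[:k], pooled[k:]))

    z = (observed - statistics.mean(null_scores)) / statistics.stdev(null_scores)
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))
```

Each group is expected to contain at least two log-transformed expression ratios with some spread; for example, `permutation_p_value([2.1, 2.4, 1.9, 2.6, 2.2], [0.1, -0.2, 0.3, 0.0, -0.1])` compares two clearly separated groups.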

Calculation of Prediction Score for Lymph-Node Metastasis:

Prediction scores were calculated according to procedures described previously (Golub, T. et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286: 531-537, 1999). Each gene (g_i) votes for either lymph node-negative or lymph node-positive depending on whether the expression level (x_i) in the sample is closer to the mean expression level of node-negative or node-positive cases in reference samples. The magnitude of the vote (v_i) reflects the deviation of the expression level in the sample from the average of the two classes:

v_i = |x_i − (μ_r + μ_n)/2|

The votes were summed to obtain total votes for node-negative (V_r) and node-positive (V_n), and PS values were calculated as follows:

PS = ((V_r − V_n)/(V_r + V_n)) × 100, reflecting the margin of victory in the direction of either node-negative or node-positive. PS values range from −100 to 100; a higher absolute value of PS reflects a stronger prediction.
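The voting scheme can be sketched in a few lines. The function name and the tie-handling (a vote exactly at the midpoint has zero magnitude and therefore does not affect the score) are assumptions made for this illustration; the subscripts r and n denote node-negative and node-positive reference cases as above.

```python
def prediction_score(sample, mean_r, mean_n):
    """Prediction score (PS) of one sample over a set of genes.

    sample : expression level x_i of each gene in the sample
    mean_r : mean expression of each gene in node-negative reference cases
    mean_n : mean expression of each gene in node-positive reference cases
    """
    votes_r = 0.0
    votes_n = 0.0
    for x, mu_r, mu_n in zip(sample, mean_r, mean_n):
        # Magnitude of the vote: distance from the midpoint of the classes.
        vote = abs(x - (mu_r + mu_n) / 2.0)
        # The gene votes for the class whose mean is closer to x.
        if abs(x - mu_r) <= abs(x - mu_n):
            votes_r += vote
        else:
            votes_n += vote
    total = votes_r + votes_n
    if total == 0:
        return 0.0  # no gene deviates from the midpoint
    return (votes_r - votes_n) / total * 100.0


if __name__ == "__main__":
    print(prediction_score([1.0, -1.0, 0.2], [1.0, 1.0, 0.0], [-1.0, -1.0, 1.0]))
```

In the toy call, two genes vote for node-negative (magnitudes 1.0 and 0.3) and one for node-positive (magnitude 1.0), giving a weakly positive PS of roughly 13.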

Evaluation of Classification and Leave-One-Out Test:

The classification score (CS) was calculated using the prediction scores of lymph node-negatives (PS_r) and node-positives (PS_n) in each gene set, as follows:

CS = (μ_PSr − μ_PSn)/(σ_PSr + σ_PSn)

A larger value of CS indicates better separation of the two groups by the predictive-scoring system. For the leave-one-out test, one sample is withheld, the permutation p-value and mean expression levels are calculated using the remaining samples, and the class of the withheld sample is subsequently evaluated by calculating its prediction score. This procedure was repeated for each of the 20 samples.
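A simplified sketch of the leave-one-out test and the classification score follows. It is not a full reproduction of the procedure: the gene set is assumed to be fixed in advance, so the re-calculation of permutation p-values on the remaining samples is omitted, and all function names and the "r"/"n" labels are hypothetical.

```python
import statistics


def prediction_score(sample, mean_r, mean_n):
    """PS = (V_r - V_n) / (V_r + V_n) x 100 for one sample."""
    votes_r = votes_n = 0.0
    for x, mu_r, mu_n in zip(sample, mean_r, mean_n):
        vote = abs(x - (mu_r + mu_n) / 2.0)
        if abs(x - mu_r) <= abs(x - mu_n):
            votes_r += vote
        else:
            votes_n += vote
    total = votes_r + votes_n
    return 0.0 if total == 0 else (votes_r - votes_n) / total * 100.0


def column_means(rows):
    """Per-gene mean over a list of equally long expression vectors."""
    return [statistics.mean(col) for col in zip(*rows)]


def leave_one_out(samples, labels):
    """PS of every sample, each computed with that sample withheld."""
    scores = []
    for i, held_out in enumerate(samples):
        rest = [(s, lab) for j, (s, lab) in enumerate(zip(samples, labels)) if j != i]
        mean_r = column_means([s for s, lab in rest if lab == "r"])
        mean_n = column_means([s for s, lab in rest if lab == "n"])
        scores.append(prediction_score(held_out, mean_r, mean_n))
    return scores


def classification_score(ps_r, ps_n):
    """CS = (mean PS_r - mean PS_n) / (sd PS_r + sd PS_n)."""
    return (statistics.mean(ps_r) - statistics.mean(ps_n)) / (
        statistics.stdev(ps_r) + statistics.stdev(ps_n)
    )
```

A withheld sample is called node-negative when its PS is positive and node-positive when it is negative; CS then summarises how well the two sets of leave-one-out scores are separated. CS is undefined when the scores within both groups are identical, a case the sketch does not handle.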

Cell Lines

Human breast cancer cell lines HBL-100, HCC1937, MCF-7, MDA-MB-435s, YMB1, SKBR3, T47D, BT-20, BT-474, BT-549, HCC1143, HCC1500, HCC1599, MDA-MB-157, MDA-MB-453, OUCB-F and ZR-75-1, and the COS-7 cell line, were purchased from American Type Culture Collection (ATCC) and were cultured under their respective depositors' recommendations. HBC4, HBC5 and MDA-MB-231 cell lines were kind gifts from Dr. Yamori of Molecular Pharmacology, Cancer Chemotherapy Centre of the Japanese Foundation for Cancer Research. All cells were cultured in appropriate media; i.e. RPMI-1640 (Sigma, St. Louis, Mo.) for HBC4, HBC5, T47D, YMB1, OUCB-F, ZR-75-1, BT-549, HCC1143, HCC1500, HCC1599 and HCC1937 (with 2 mM L-glutamine); Dulbecco's modified Eagle's medium (Invitrogen, Carlsbad, Calif.) for BT-474, HBL-100 and COS-7; EMEM (Sigma) with 0.1 mM essential amino acid (Roche), 1 mM sodium pyruvate (Roche) and 0.01 mg/ml insulin (Sigma) for BT-20 and MCF-7; McCoy (Sigma) for SKBR3 (with 1.5 mM L-glutamine); L-15 (Roche) for MDA-MB-231, MDA-MB-157, MDA-MB-453 and MDA-MB-435s. Each medium was supplemented with 10% fetal bovine serum (Cansera) and 1% antibiotic/antimycotic solution (Sigma). MDA-MB-231 and MDA-MB-435s cells were maintained at 37° C. in an atmosphere of humidified air without CO₂. Other cell lines were maintained at 37° C. in an atmosphere of humidified air with 5% CO₂. Clinical samples (breast cancer and normal breast duct) were obtained from surgical specimens, concerning which all patients had given informed consent.

Northern-Blot Analysis

Total RNAs were extracted from all breast cancer cell lines using the RNeasy kit (QIAGEN) according to the manufacturer's instructions. After treatment with DNase I (Nippon Gene, Osaka, Japan), mRNA was isolated with an mRNA purification kit (Amersham Biosciences) following the manufacturer's instructions. A 1-μg aliquot of each mRNA, along with polyA(+) RNAs isolated from normal adult human breast (Biochain), lung, heart, liver, kidney, and bone marrow (BD, Clontech, Palo Alto, Calif.), was separated on 1% denaturing agarose gels and transferred to nylon membranes (Breast cancer-Northern blots). Breast cancer- and Human multiple-tissue Northern blots (Clontech, Palo Alto, Calif.) were hybridized with an [α³²P]-dCTP-labeled PCR product of A7870 prepared by RT-PCR (see below). Pre-hybridization, hybridization and washing were performed according to the supplier's recommendations. The blots were autoradiographed with intensifying screens at −80° C. for 14 days. A specific probe for A7870 (320 bp) was prepared by RT-PCR using the following primer set: 5′-AGACCCTAAAGATCGTCCTTCTG-3′ (SEQ ID NO:13) and 5′-GTGTTTTAAGTCAGCATGAGCAG-3′ (SEQ ID NO:14), and was radioactively labeled with the megaprime DNA labeling system (Amersham Biosciences).

Immunocytochemical Staining

To construct A7870 expression vectors, the entire coding sequence of A7870 cDNA was amplified by PCR using KOD-Plus DNA polymerase (Toyobo, Osaka, Japan). The PCR products were inserted into the EcoRI and XhoI sites of the pCAGGSn3FH-HA expression vector. This construct (pCAGGS-A7870-HA) was confirmed by DNA sequencing. Next, to initially examine the sub-cellular localization of exogenous A7870, we seeded COS7 cells at 1×10⁵ per well for exogenous expression. After 24 hours, we transiently transfected COS7 cells with 1 μg of pCAGGS-A7870-HA using FuGENE 6 transfection reagent (Roche) according to the manufacturer's instructions. Then, cells were fixed with PBS containing 4% paraformaldehyde for 15 min, and rendered permeable with PBS containing 0.1% Triton X-100 for 2.5 min at 4° C. Subsequently, the cells were covered with 3% BSA in PBS for 12 hours at 4° C. to block non-specific hybridization. Next, A7870-HA-transfected COS7 cells were incubated with a mouse anti-HA antibody (SANTA CRUZ) at 1:1000 dilution and anti-TOPK polyclonal antibody (Cell Signaling) at 1:1000 dilution. After washing with PBS, the transfected cells were stained by an Alexa594-conjugated anti-mouse secondary antibody (Molecular Probe) at 1:5000 dilution.

We further confirmed the sub-cellular localization of endogenous A7870 protein in the breast cancer cell lines T47D, BT-20 and HBC5, seeded at 2×10⁵ cells per well. Cells were incubated with a rabbit anti-TOPK polyclonal antibody raised against a synthetic peptide corresponding to amino acids at the C-terminus of human PBK/TOPK at 1:1000 dilution. After washing with PBS, the cells were stained by an Alexa488-conjugated anti-rabbit secondary antibody (Molecular Probe) at 1:3000 dilution. Nuclei were counter-stained with 4′,6′-diamidine-2′-phenylindole dihydrochloride (DAPI). Fluorescent images were obtained under a TCS SP2 AOBS microscope (Leica, Tokyo, Japan).

Construction of A7870 Specific-siRNA Expression Vector Using psiU6BX3.0

We established a vector-based RNAi system using the psiU6BX3.0 siRNA expression vector according to the previous report (Shimokawa T, Furukawa Y, Sakai M, Li M, Miwa N, Lin Y M, Nakamura Y (2003). Cancer Res, 63, 6116-6120). A siRNA expression vector against A7870 (psiU6BX-A7870) was prepared by cloning the double-stranded oligonucleotides in Table 13 into the BbsI site in the psiH1BX3.0 vector. Control plasmids, psiU6BX-SC and psiU6BX-LUC, were prepared by cloning double-stranded oligonucleotides of 5′-TCCCGCGCGCTTTGTAGGATTCGTTCAAGAGACGAATCCTACAAAGCGCGC-3′ (SEQ ID NO:15) and 5′-AAAAGCGCGCTTTGTAGGATTCGTCTCTTGAACGAATCCTACAAAGCGCGC-3′ (SEQ ID NO:16) for SC (scrambled control), and 5′-TCCCCGTACGCGGAATACTTCGATTCAAGAGATCGAAGTATTCCGCGTACG-3′ (SEQ ID NO:17) and 5′-AAAACGTACGCGGAATACTTCGATCTCTTGAATCGAAGTATTCCGCGTACG-3′ (SEQ ID NO:18) for LUC (luciferase control), into the BbsI site in the psiU6BX3.0 vector, respectively.

Gene-Silencing Effect of A7870

Human breast cancer cell lines T47D and BT-20 were plated onto 15-cm dishes (4×10⁶ cells/dish) and transfected with 16 μg each of psiU6BX-LUC (luciferase control) and psiU6BX-SC (scrambled control) as negative controls, and psiU6BX-A7870, using FuGENE6 reagent according to the supplier's recommendations (Roche). Twenty-four hours after transfection, cells were re-seeded for colony formation assay (2×10⁶ cells/10 cm dish), RT-PCR (2×10⁶ cells/10 cm dish) and MTT assay (2×10⁶ cells/well). We selected the transfected cells with medium containing 0.7 mg/ml or 0.6 mg/ml of neomycin (Geneticin, Gibco) for T47D or BT-20 cells, respectively. Afterward, we changed the medium every two days for 3 weeks. To evaluate the functioning of siRNA, total RNA was extracted from the cells at 11 days after neomycin selection, and then the knockdown effect of siRNAs was confirmed by semi-quantitative RT-PCR using specific primer sets for A7870 and GAPDH: 5′-ATGGAAATCCCATCACCATCT-3′ (SEQ ID NO:19) and 5′-GGTTGAGCACAGGGTACTTTATT-3′ (SEQ ID NO:20) for GAPDH as an internal control, and 5′-GCCTTCATCATCCAAACATT-3′ (SEQ ID NO:21) and 5′-GGCAAATATGTCTGCCTTGT-3′ (SEQ ID NO:22) for A7870.

Moreover, transfectants expressing siRNAs in the T47D or BT-20 cell lines were grown for 23 days in selective media containing neomycin. After fixation with 4% paraformaldehyde, transfected cells were stained with Giemsa solution to assess colony formation. MTT assays were performed to quantify cell viability. After 10 days of culture in the neomycin-containing medium, MTT solution (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide) (Sigma) was added at a concentration of 0.5 mg/ml. Following incubation at 37° C. for 2.5 hours, acid-SDS (0.01N HCl/10% SDS) was added; the suspension was mixed vigorously and then incubated overnight at 37° C. to dissolve the dark blue crystals. Absorbance at 570 nm was measured with a Microplate Reader 550 (BioRad). To evaluate the functioning of siRNA, total RNA was extracted from cells 7 days after selection, and the MTT assay was performed 10 days after selection using Cell Counting Kit-8 (Dojindo) according to the manufacturer's protocol. Absorbance was measured at 570 nm with a Microplate Reader 550 (BioRad). For the colony formation assay, cells were fixed with 4% paraformaldehyde for 15 min before staining with Giemsa's solution (Merck). Each experiment was performed in triplicate.

Results

Classification Analysis on the Basis of Precise Gene Expression Profiles of Breast Cancer:

Since breast cancer contains a low population of cancer cells in the tumor mass and originates from normal epithelial duct cells, microdissection was carried out to avoid contamination by the surrounding non-cancerous cells or non-normal ductal epithelial cells. As the great majority of cells in breast tissue are adipocytes, whole breast tissue was considered unsuitable for analyzing cancer-specific expression profiles in that organ. As shown in FIG. 1, representative examples of DCIS (case 10326T), IDC (10502T), and normal ductal epithelium (10341N) were microdissected from each clinical specimen. This allows the subsequent gene expression profiles to be obtained more precisely. The proportion of adipocytes that contaminated the microdissected population of normal breast ductal epithelial cells serving as a universal control was examined by measuring the signal intensities of two genes (i.e., PLIN and FABP4) that are highly expressed in adipose and mammary gland tissues as described previously (Saito-Hisaminato, A., et al., Genome-wide profiling of gene expression in 29 normal human tissues with a cDNA microarray. DNA Res, 9: 35-45, 2002). When the signal intensities of these genes were investigated in whole mammary gland tissue, which contains a large number of adipocytes, the average ratio of the signal intensities of these genes was approximately 99.4%; the ratio in microdissected normal breast ductal epithelial cells was approximately 0.6% (see Contamination percentage section in Materials and Methods). Therefore, the average proportion of contaminating adipocytes in the populations of control cells was estimated to be 0.6% after microdissection.

First, an unsupervised two-dimensional hierarchical clustering algorithm was applied to group genes on the basis of similarity in their expression pattern over 102 clinical samples: 81 microdissected different clinical breast cancer specimens, 11 microdissected different histological types in 10 individuals, 2 whole breast cancer tissues, 6 microdissected normal breast ductal cells and 2 whole mammary gland tissues. Reproducible clusters were obtained with 710 genes (see Materials and Methods); their expression patterns across the 102 samples are shown in FIG. 2A. In the sample axis, the 102 samples were clustered into three major groups (Group A, B and C) on the basis of their expression profiles. Then, this classification was associated with clinical parameters, especially estrogen receptor (ER) status as determined by EIA. Out of 55 ER-positive tumors, 45 cases clustered into the same branch (Group B) of the tumor dendrogram, suggesting an association with ER status. Moreover, 7 of 10 cases with different histological types (sample #10864, 10149, 10818, 10138, 10005, 10646 and 10435), which were labeled and hybridized in independent experiments, clustered most closely within the same group. In particular, among them, the one duplicated case (10149a1 and 10149a1T) also clustered into the shortest branch, supporting the reproducibility and reliability of the microarray data. Remarkably, Group C contained microdissected non-cancerous cells and breast cancer whole tissues, with the exception of one microdissected tumor case, suggesting that these data represent accurate breast cancer-specific expression profiles.
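The sample-axis grouping described above can be illustrated with a small agglomerative clustering sketch in Python. This is not the algorithm or software used for FIG. 2A: the distance (one minus the Pearson correlation), the average-linkage rule, the function names and the four toy expression profiles are all assumptions chosen for illustration, since this section does not state those details.

```python
# Sketch: average-linkage hierarchical clustering of samples by expression profile.
# Distance metric, linkage rule and data are illustrative assumptions.
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """1 - Pearson correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return 1.0 - num / den


def average_linkage(profiles, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = mean(
                pearson_distance(profiles[a], profiles[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical log-ratio profiles over four genes (not data from the study).
toy_profiles = {
    "T1": [2.0, 1.5, -1.0, -2.0],
    "T2": [1.8, 1.7, -0.8, -2.2],
    "N1": [-1.5, -2.0, 1.9, 1.0],
    "N2": [-1.7, -1.8, 2.1, 1.2],
}
groups = average_linkage(toy_profiles, n_clusters=2)
print(groups)
```

In a two-dimensional analysis the same procedure would be applied a second time to the transposed matrix to order the genes.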

Furthermore, a two-dimensional hierarchical clustering analysis of 89 genes was performed across 16 samples with 2 differentiated lesions microdissected from 8 breast cancer patients. As a result, breast cancer samples with different phenotype lesions were closely adjacent (FIG. 2B). Next, a random permutation test was carried out to identify the genes that were differentially expressed between the patient-matched phenotypically well- and poorly-differentiated lesions from the 8 microdissected cancer specimens. As shown in FIG. 2C, clustering analysis using 25 genes that showed differential expression can separate well- from poorly-differentiated invasive ductal cancer cells. These 25 genes (Table 1) included some key factors whose possible roles in invasion and cell growth had been reported previously: TNFSF11, ITGA5 and NFAT5 (Giuliani, N., et al., Human myeloma cells stimulate the receptor activator of nuclear factor-kappa B ligand (RANKL) in T lymphocytes: a potential role in multiple myeloma bone disease. Blood, 100: 4615-4621, 2002; Sebastien J. et al., The role of NFAT transcription factors in integrin-mediated carcinoma invasion. Nature cell biology, 4: 540-544, 2002; Klein, S. et al., Alpha 5 beta 1 integrin activates an NF-kappa B-dependent program of gene expression important for angiogenesis and inflammation. Mol Cell Biol, 22: 5912-5922, 2002).

Next, a random permutation test was carried out to identify the genes that were differentially expressed between 41 ER-positive tumors and 28 ER-negative tumors in IDC. All of these samples were from premenopausal patients. 97 genes that were able to distinguish between ER-positive and ER-negative tumors with a permutation P-value of less than 0.0001 were listed (see “Materials and Methods”) (FIG. 3 and Table 2). Among them 96 genes were selected as BRC related genes of the present invention. Expression levels were increased for 92 of those genes and decreased for the other five in the ER-positive group, as compared to the ER-negative group. Among these genes, GATA binding protein 3 (GATA3), trefoil factor 3 (TFF3), cyclin D1 (CCND1), MAPKK homolog (MAP2K4), tissue inhibitor of metalloprotease 1 (TIMP1), insulin receptor substrate 1 (IRS1), X-box binding protein 1 (XBP1) and GLI-Kruppel family member GLI3 (GLI3) were over-expressed in the ER-positives (Table 2). In addition, since estrogen receptor (ESR1) was ranked sixth on the basis of magnitude of p-value (bottom panel in FIG. 3), it may be possible to distinguish breast cancers according to expression profiles of ER.
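For readers unfamiliar with permutation testing, the general idea can be sketched in Python as follows. This is a generic label-shuffling test, not the specific procedure of the Materials and Methods: the test statistic (absolute difference of group means), the number of permutations, the function name and the toy values are assumptions made for illustration.

```python
# Sketch: two-group random permutation test for one gene.
# Statistic and data are illustrative assumptions, not the study's procedure.
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels give a difference at least as large
    as the observed one; the +1 terms keep the estimate above zero."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical expression values for one gene (not data from the study).
er_positive = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]
er_negative = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8]

p_separated = permutation_p_value(er_positive, er_negative)
p_identical = permutation_p_value([1, 2, 3, 4], [1, 2, 3, 4])
print(p_separated, p_identical)
```

Note that with this estimator the smallest reportable value is 1/(n_permutations + 1), so resolving a threshold such as P < 0.0001 requires at least 10,000 permutations.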

Identification of Commonly Up- or Down-Regulated Genes in DCIS or IDC:

To further clarify mechanisms underlying carcinogenesis of breast cancer, genes commonly up- or down-regulated in DCIS and IDC were investigated. Gene expression profiles in 77 breast tumors (8 DCIS and 69 IDC from premenopausal patients) identified 325 genes with commonly altered expression (FIG. 4A, 4B): 78 genes that were commonly up-regulated more than three-fold over their levels in normal breast ductal cells (FIG. 4A, 4B, Table 3, 5), and 247 genes whose expression was reduced to less than ⅓ in breast cancer cells (FIG. 4A, 4B, Table 4, 6). In particular, as shown in FIG. 4B, the expression level of 25 genes was increased and that of 49 genes was decreased in the transition from DCIS to IDC (Table 5 and 6). Among genes with elevated expression, fibronectin (FN1), which had already been reported as over-expressed in breast cancers (Mackay, A. et al., cDNA microarray analysis of genes associated with ERBB2 (HER2/neu) overexpression in human mammary luminal epithelial cells. Oncogene, 22: 2680-2688, 2003; Lalani, E. N. et al., Expression of the gene coding for a human mucin in mouse mammary tumor cells can affect their tumorigenicity. J Biol Chem, 266: 15420-15426, 1991; Martin-Lluesma, S., et al., Role of Hec1 in spindle checkpoint signaling and kinetochore recruitment of Mad1/Mad2. Science, 297: 2267-2270, 2002), was included (Table 4). On the other hand, among genes with decreased expression, ST5 and SCHIP1, which are known to function as tumor suppressors, were also included (Table 6).
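The two fold-change thresholds stated above (more than three-fold up, or less than one third) can be expressed as a small Python sketch. Only the thresholds come from the text; the function name, the gene labels and the ratios are hypothetical, and any additional selection criteria described in the Materials and Methods are not modeled here.

```python
# Sketch: classify a tumor/normal expression ratio using the 3-fold thresholds
# from the text. Gene labels and ratios are hypothetical examples.

def classify_by_fold_change(ratio, fold=3.0):
    """Return 'up' if ratio > fold, 'down' if ratio < 1/fold, else 'unchanged'."""
    if ratio > fold:
        return "up"
    if ratio < 1.0 / fold:
        return "down"
    return "unchanged"


example_ratios = {"gene_1": 4.2, "gene_2": 0.25, "gene_3": 1.1}
calls = {gene: classify_by_fold_change(r) for gene, r in example_ratios.items()}
print(calls)

# Consistency check on the counts reported in the text.
total_altered = 78 + 247
print(total_altered)
```

The comparison is strict on both sides, so a ratio of exactly 3 or exactly ⅓ is treated as unchanged, matching the wording "more than three-fold" and "less than ⅓".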

Next, genes with altered expression exclusively in IDC were investigated. As a result, 24 up-regulated genes (FIG. 4C, Table 7) and 41 down-regulated genes (FIG. 4C, Table 8) were identified. Of the up-regulated genes, ERBB2, CCNB1 and BUB1B were already known to be involved in carcinogenesis of breast cancers (Latta, E. K., et al., The role of HER2/neu overexpression/amplification in the progression of ductal carcinoma in situ to invasive carcinoma of the breast. Mod Pathol, 15: 1318-1325, 2002; Takeno, S., et al., Prognostic value of cyclin B1 in patients with esophageal squamous cell carcinoma. Cancer, 94: 2874-2881, 2002; Slamon, D. J., et al., Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science, 235: 177-182, 1987). Of the down-regulated genes, AXUD1, a gene induced by AXIN that is frequently down-regulated in lung, liver, colon and kidney cancers (Ishiguro, H., et al., Identification of AXUD1, a novel human gene induced by AXIN1 and its reduced expression in human carcinomas of the lung, liver, colon and kidney. Oncogene, 20: 5062-5066, 2001), was included, suggesting that AXUD1 may also be involved in breast cancer carcinogenesis.

Verification of Selected Genes by Semi-Quantitative RT-PCR:

To confirm the reliability of the expression data obtained by cDNA microarray analysis, semi-quantitative RT-PCR experiments were performed for 3 genes (Accession No. AI261804, AA205444, AA167194) that were highly up-regulated in informative cases with well-differentiated type, and 2 genes (AA676987 and H22566) that were also highly up-regulated in informative cases with poorly-differentiated type. The RT-PCR results were highly concordant with those of the microarray analysis in the great majority of the tested cases (FIG. 5, Table 9).

Identification of A7870, Designated T-LAK Cell Originated Protein Kinase, as an Up-Regulated Gene in Breast Cancer Cells

We identified 24 genes that were up-regulated in IDC (Table 7). Among them, we focused on A7870, designated T-LAK cell originated protein kinase, TOPK (Genbank Accession NM_018492), which is located at chromosome 8p21.2 and has an mRNA transcript 1899 bases in length consisting of 8 exons. Expression of A7870 was elevated in 30 of 39 (77%) breast cancer cases for which expression data could be obtained, and especially in 29 of 36 (81%) cases with invasive ductal carcinoma specimens. To confirm the expression pattern of this gene in breast cancers, we performed semi-quantitative RT-PCR analysis using breast cancer cell lines and normal human tissues including normal breast cells. As a result, we found that A7870 showed elevated expression in 7 of 12 clinical breast cancer specimens (well-differentiated type) compared to normal breast ductal cells and other normal tissues (FIG. 6 a), and was overexpressed in 17 of 20 breast cancer cell lines (FIG. 6 b). To further examine the expression pattern of this gene, we performed Northern blot analyses with multiple human tissues and breast cancer cell lines using a cDNA fragment (320 bp) of A7870 as a probe (FIG. 7 a). As a result, we observed that two transcripts (approximately 1.9 kb and 1.8 kb) were exclusively expressed in normal human testis and thymus. When we further examined the expression pattern of these transcripts with a breast cancer Northern blot, we found that both transcripts were specifically overexpressed in breast cancer cell lines, compared to normal human tissues (FIG. 7 b).

Isolation of Breast Cancer Specific-Expressed Transcript of A7870.

Sequencing analysis of the two transcripts of A7870 showed that both variants contain the same open reading frame (ORF). We therefore focused on TOPK (Genbank accession number NM_018492), which encodes a serine/threonine kinase related to the dual specific mitogen-activated protein kinase kinase (MAPKK) family. SMART computer prediction shows that TOPK contains a pfam pkinase motif at residues 32 to 320, suggesting that this protein might be involved in a signal transduction pathway that plays a role in cell morphogenesis and cell growth.

Subcellular Localization of A7870

To further characterize A7870, we examined the sub-cellular localization of its gene product in mammalian cells. First, when we transiently transfected plasmids expressing A7870 protein (pCAGGS-A7870-HA) into COS7 cells, immunocytochemical analysis with anti-HA tag antibody and TOPK polyclonal antibody revealed that exogenous A7870 protein localized to the cytoplasm, with an especially strong signal around the nuclear membrane, in all transfected COS7 cells (FIG. 8 a). Moreover, we examined the sub-cellular localization of the endogenous protein by immunocytochemical staining using an anti-TOPK polyclonal antibody. Similarly, A7870 protein was also observed in the cytoplasmic apparatus and around the nucleus in T47D, BT-20 and HBC5 cells (FIG. 8 b).

Growth-Inhibitory Effects of Small-Interfering RNA (siRNA) Designed to Reduce Expression of A7870

To assess the growth-promoting role of A7870, we knocked down the expression of endogenous A7870 in the breast cancer lines T47D and BT-20, which overexpress A7870, by means of the mammalian vector-based RNA interference (RNAi) technique (see Materials and Methods). We examined expression levels of A7870 by semi-quantitative RT-PCR experiments. A7870-specific siRNAs (si1, si3 and si4) significantly suppressed expression, compared with control siRNA constructs (psiU6BX-LUC or -SC). To confirm the cell growth inhibition by A7870-specific siRNAs, we performed colony-formation and MTT assays. As a result, introduction of A7870 siRNA constructs suppressed growth of these breast cancer cells, consistent with the reduced expression of this gene described above. Each result was verified by three independent experiments. Thus, our findings suggest that A7870 has a significant function in the cell growth of breast cancer.

Identification of Genes Differentially Expressed Among Histopathological Types and Phenotypic Differences in Individual Patients:

One goal of the present invention was to discover genes consistently up- or down-regulated at different phenotypes in some patients. However, since breast cancer shows heterogeneous and various phenotypes, histopathological differentiation by microscopy was not clearly discerned using unsupervised classification by gene expression patterns as shown in FIG. 2. To examine this observation more closely, a random-permutation test was performed and 206 genes that can distinguish between well-differentiated and poorly-differentiated cases were extracted. These 206 discriminating genes were all significant at the level of P<0.01 between 31 well- and 24 poorly-differentiated cancers (FIG. 9, Table 10). Two-dimensional hierarchical clustering analysis using these 206 genes was also able to classify the groups with regard to the distinct components of IDC (well-differentiated, moderately-differentiated and poorly-differentiated). The group A cluster contained genes with markedly increased expression in poorly-differentiated samples (branch 1 in the horizontal row), including genes related to extracellular matrix structure (COL1A2, COL3A1 and P4HA2) and cell adhesion (LOXL2, THBS2 and TAGLN2), whereas the group B cluster contained genes with increased expression primarily in well-differentiated and moderately-differentiated samples (branch 2 in the horizontal row), including genes related to regulation of transcription (BTF, WTAP, HTATSF1) and cell cycle regulation (CDC5L, CCT7). Two poorly-differentiated samples (sample #10709 and 10781) in group B, however, showed an expression pattern that was similar to the well-differentiated signature rather than to poorly-differentiated types. Some well-differentiated samples demonstrated co-expression of some genes that are characteristic of the poorly-differentiated signature.

Development of Predictive Scores for Lymph Node Metastasis:

In breast cancer, invasion into axillary lymph nodes is the most important prognostic factor (Shek, L. L. and Godolphin, W. Model for breast cancer survival: relative prognostic roles of axillary nodal status, TNM stage, estrogen receptor concentration, and tumor necrosis. Cancer Res, 48: 5565-5569, 1988). To develop an equation yielding a scoring parameter for the prediction of axillary lymph node metastasis using expression profiles of selected genes, the expression profiles of 20 node-positive cases and 20 node-negative cases were compared. Following the criteria described above, the 93 discriminating genes that showed permutation p-values of less than 0.0001 were first selected. Then, the top 34 genes in the candidate list that showed the best separation of node-positive from node-negative cases were obtained (Table 11). As shown in FIG. 10A, a hierarchical clustering analysis using these 34 genes clearly classified all 40 breast cancer cases into one of two groups according to lymph-node status.

Finally, a predictive-scoring system that could clearly distinguish node-positive cases from node-negative cases using the expression profiles of the set of 34 genes was constructed. To further validate this scoring system, scores for 20 node-positive cases and 20 lymph node-negative cases that had not been among those used for construction of the scoring system were calculated (see “Materials and Methods”). With 15.8 taken as the borderline score, the 40 patients belonging to the metastasis-positive and metastasis-negative groups were clearly separated (FIG. 10B); scores over 15.8 were defined as “positive”, and those of 15.8 or lower as “negative”. To test the system further, prediction scores of metastasis from primary tumors were calculated for 17 node-positive cases and 20 node-negative cases that had not been part of the original procedure for selecting discriminating genes. As shown in FIGS. 10B and 10C, among the 17 cases with lymph-node metastasis, all cases had positive scores according to the definition herein, whereas 18 (90%) of the 20 cases without lymph-node metastasis showed negative scores. Overall, 75 (97%) of 77 cases were placed correctly according to their lymph-node status, but two node-negative cases were misplaced into the borderline or positive region.
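The decision rule and the reported accuracy figures can be reproduced with a few lines of Python. The cutoff of 15.8 and the case counts are taken from the text; the function name is hypothetical, the scoring equation itself is defined in the Materials and Methods and is not reproduced here, and the reading of the 77 cases as 40 initial cases plus 37 additional cases is an interpretation of the counts given above.

```python
# Sketch: apply the 15.8 borderline score and recompute the reported percentages.
# The score values passed in would come from the equation in Materials and Methods.

CUTOFF = 15.8


def call_node_status(score, cutoff=CUTOFF):
    """Scores over the cutoff are 'positive'; the cutoff itself or lower is 'negative'."""
    return "positive" if score > cutoff else "negative"


# Counts reported in the text (correct calls, total cases).
initial_cases = (40, 40)         # 20 node-positive + 20 node-negative, clearly separated
additional_positive = (17, 17)   # all 17 node-positive cases scored positive
additional_negative = (18, 20)   # 18 of 20 node-negative cases scored negative

groups = [initial_cases, additional_positive, additional_negative]
correct = sum(c for c, _ in groups)
total = sum(t for _, t in groups)

overall_pct = round(100 * correct / total)
negative_pct = round(100 * additional_negative[0] / additional_negative[1])
print(correct, total, overall_pct, negative_pct)
```

The recomputed values agree with the 75 of 77 (97%) and 18 of 20 (90%) figures quoted in the paragraph above.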

Discussion

Breast cancer is a multifactorial disease that develops as a result of interactions among genetic, environmental, and hormonal factors. Although distinct pathological stages of breast cancer have been described, the molecular differences among these stages are largely unknown (McGuire, W. L. Breast cancer prognostic factors: evaluation guidelines. J Natl Cancer Inst, 83: 154-155, 1991; Eifel, P., et al., National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, Nov. 1-3, 2000. J Natl Cancer Inst, 93: 979-989, 2001; Fisher, B., et al., Twenty-year follow-up of a randomized trial comparing total mastectomy, lumpectomy, and lumpectomy plus irradiation for the treatment of invasive breast cancer. N Engl J Med, 347: 1233-1241, 2002).

The development of genome-wide analysis of gene expression and of laser microbeam microdissection (LMM), which isolates pure cancerous cell populations, enables the search for molecular-target genes useful for cancer-specific classification, treatment and outcome prediction in a variety of tumor types, especially in breast cancer.

Since adipocytes account for more than 90% of mammary gland tissue, and epithelial cells in the organ, from which the carcinoma originates, correspond to a very small percentage, an analysis of gene-expression profiles using whole cancer tissues and normal whole mammary gland is significantly influenced by the particular mixture of cells in the tissues examined; proportional differences of adipocytes, fibroblasts, and inflammatory cells can significantly mask the specific expression of genes involved in breast carcinogenesis. Hence, an LMM system was used to purify as much as possible the populations of cancerous cells and normal epithelial cells obtained from surgical specimens (Hasegawa, S., et al. Genome-wide analysis of gene expression in intestinal-type gastric cancers using a complementary DNA microarray representing 23,040 genes. Cancer Res, 62: 7012-7017, 2002; Kitahara, et al., and Tsunoda, T. Alterations of gene expression during colorectal carcinogenesis revealed by cDNA microarrays after laser-capture microdissection of tumor tissues and normal epithelia. Cancer Res, 61: 3544-3549, 2001; Kikuchi, T., et al. Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs. Oncogene, 22: 2192-2205, 2003; Gjerdrum, L. M., et al., Laser-assisted microdissection of membrane-mounted paraffin sections for polymerase chain reaction analysis: identification of cell populations using immunohistochemistry and in situ hybridization. J Mol Diagn, 3: 105-110, 2001) (FIG. 1). To evaluate the purity of microdissected cell populations, expression of PLIN and FABP4, which are highly expressed in adipose tissue and mammary gland, was analyzed by gene expression profiles in 29 normal human tissues using a cDNA microarray (Saito-Hisaminato, A., et al., Genome-wide profiling of gene expression in 29 normal human tissues with a cDNA microarray. DNA Res, 9: 35-45, 2002). After the dissection procedure the proportion of contaminating adipocytes among the normal breast ductal epithelial cells was estimated to be smaller than 0.6%. In particular, when expression levels of PLIN were examined (Nishiu, J., et al., Isolation and chromosomal mapping of the human homolog of perilipin (PLIN), a rat adipose tissue-specific gene, by differential display method. Genomics, 48: 254-257, 1998), the purity of cell populations subjected to the LMM technique could therefore be approximately 100%. As shown in FIG. 2, unsupervised cluster analysis showed that breast cancer whole tissues were separated from breast cancer cells microdissected by LMM, whereas normal breast ductal cells and mammary glands were clustered in the same branch. Hence, to obtain an accurate breast cancer-specific expression profile, it is essential to microdissect breast cancer cells and the normal breast ductal epithelial cells from which breast cancer originates. The combined use of LMM and cDNA microarray analysis provides a powerful approach to elucidate precise molecular events surrounding the development and progression of breast cancer, and leads to the understanding of the mechanism of multistep carcinogenesis of breast cancer cells and tumor heterogeneity.

As shown in FIG. 2A, through an unsupervised classification analysis on the basis of expression profiles, primary breast cancer can be divided into two groups that were shown to associate with ER status by EIA. It was discovered that ER+ and ER− tumors display very different gene expression phenotypes. This result suggests that these two histologically distinct lesions have different biological natures that may play an important role in carcinogenesis of breast cancer, and further suggests that ER status can be used to establish the necessity of hormone therapy in the adjuvant setting (Eifel, P., et al National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, Nov. 1-3, 2000. J Natl Cancer Inst, 93: 979-989, 2001; Hartge, P. Genes, hormones, and pathways to breast cancer. N Engl J Med, 348: 2352-2354, 2003). In addition, through supervised statistical analysis, a subset of genes able to separate ER-positive from ER-negative tumors was selected to investigate hormone-dependent progression, and novel molecular targets for anti-cancer drugs were explored. 97 genes whose expression is significantly different between these two groups consisting of premenopausal patients were identified by a random permutation test (FIG. 3). Among these genes, MAP2K4, which is a centrally-placed mediator of the SAPK pathways, was included. Cyclin D1, a gene that is strongly associated with ER expression in breast cancer in this and other studies (May, F. E. and Westley, B. R. Expression of human intestinal trefoil factor in malignant cells and its regulation by oestrogen in breast cancer cells. J Pathol, 182: 404-413, 1997), was also included. Estrogens are important regulators of growth and differentiation in the normal mammary gland and are also important in the development and progression of breast carcinoma (Shek, L. L. and Godolphin, W. Model for breast cancer survival: relative prognostic roles of axillary nodal status, TNM stage, estrogen receptor concentration, and tumor necrosis. Cancer Res, 48: 5565-5569, 1988). Estrogens regulate gene expression via ER; however, the details of the estrogen effect on downstream gene targets, the role of cofactors, and cross-talk between other signaling pathways are far from fully understood. As approximately two-thirds of all breast cancers are ER+ at the time of diagnosis, the expression of the receptor has important implications for their biology and therapy. Since novel selective estrogen receptor modulators (SERMs) have recently been developed as hormonal treatment for ER-positive breast cancer patients, these genes associated with ER status might be novel potential molecular targets for SERMs (Smith, I. E. and Dowsett, M. Aromatase inhibitors in breast cancer. N Engl J Med, 348: 2431-2442, 2003). These findings suggest that the comparison of expression profiles and ER status provides useful information to elucidate the hormonal regulation of cell proliferation and progression of ER-independent breast cancer cells.

The development and use of molecular-based therapy for breast cancer and other human malignancies requires a detailed molecular genetic analysis of patient tissues. Histological evidence suggests that several pre-neoplastic states exist that precede invasive breast tumors. These histological lesions include atypical ductal hyperplasia, atypical lobular hyperplasia, ductal carcinoma in situ (DCIS), and lobular carcinoma in situ (Lakhani, S. R. The transition from hyperplasia to invasive carcinoma of the breast. J Pathol, 187: 272-278, 1999). These lesions are thought to fall on a histological continuum between normal breast epithelium or the terminal duct lobular units from which breast cancers arise, and the final invasive breast cancer. Several models have been proposed to explain the genetic abnormalities between pre-neoplasia and neoplasia.

Various genes that showed commonly increased or decreased expression among the pathologically discrete stages, such as in the comparison between DCIS and IDC, were observed, resulting in the identification of 325 genes in total. These genes may underlie the molecular basis of the pathological grade for breast cancer, and expression levels of these genes were correlated with advanced tumor grade. 78 commonly up-regulated genes (Table 3, 5) and 247 commonly down-regulated genes (Table 4, 6) in DCIS and IDC were also identified. Among up-regulated genes, NAT1, HEC, GATA3 and RAI3, which have been reported to be over-expressed in breast cancer, were noted as potentially expressed in preinvasive stages (Geylan, Y. S., et al., Arylamine N-acetyltransferase activities in human breast cancer tissues. Neoplasma, 48: 108-111, 2001; Chen, Y., et al., HEC, a novel nuclear protein rich in leucine heptad repeats specifically involved in mitosis. Mol Cell Biol, 17: 6049-6056, 1997; Bertucci, F., et al., Gene expression profiling of primary breast carcinomas using arrays of candidate genes. Hum Mol Genet, 9: 2981-2991, 2000; Cheng, Y. and Lotan, R. Molecular cloning and characterization of a novel retinoic acid-inducible gene that encodes a putative G protein-coupled receptor. J Biol Chem, 273: 35008-35015, 1998). On the other hand, TGFBR2, included as a down-regulated gene in the present invention, is known to lead to reduced malignancy (Sun, L., et al., Expression of transforming growth factor beta type II receptor leads to reduced malignancy in human breast cancer MCF-7 cells. J Biol Chem, 269: 26449-26455, 1994). These findings suggest that these genes may be involved in the transition from DCIS to IDC.

In particular, 25 up-regulated genes (Table 5) and 49 down-regulated genes (Table 6) were identified with elevated or decreased expression according to the transition from DCIS to IDC. The list of up-regulated elements included genes encoding transcriptional factors and proteins involved in the signal transduction pathway and in the cell cycle, which play an important role in invasive tumorigenesis. Over-expression of FoxM1 and cyclin B1 has been reported in various tumor types. Over-expression of FoxM1 stimulates cyclin B1 expression (Leung T W, 2001). CCNB1 is a cell cycle control protein that is required for passage through G2 and mitosis (Pines, J. and Hunter, T. Cyclins A and B1 in the human cell cycle. Ciba Found Symp, 170: 187-196; discussion 196-204, 1992). TOP2A inhibitors are widely used as chemotherapeutic agents in lung cancer treatment (Miettinen, H. E., et al., High topoisomerase II alpha expression associates with high proliferation rate and poor prognosis in oligodendrogliomas. Neuropathol Appl Neurobiol, 26: 504-512, 2000). BUB1B may be responsible for a chromosomal instability phenotype contributing to tumor progression in mitotic checkpoint and genetic instability (Bardelli, A., et al. Carcinogen-specific induction of genetic instability. Proc Natl Acad Sci USA, 98: 5770-5775, 2001). Expression of MMP11 was shown to have a direct negative effect on patients' survival (Boulay, A., et al. High cancer cell death in syngeneic tumors developed in host mice deficient for the stromelysin-3 matrix metalloproteinase. Cancer Res, 61: 2189-2193, 2001). ECM1 has angiogenic properties and is expressed by breast tumor cells (Han, Z., et al., Extracellular matrix protein 1 (ECM1) has angiogenic properties and is expressed by breast tumor cells. Faseb J, 15: 988-994, 2001). Although most of these functions are still unknown, functional analysis of these genes may indicate that they play a role in mediating invasive activity.

In this report, through precise expression profiling of breast cancer by means of a genome-wide cDNA microarray, we isolated a novel gene, A7870, that was significantly overexpressed in breast cancer cells compared to normal human tissues. Furthermore, we demonstrated that treatment of breast cancer cells with siRNA effectively inhibited expression of the target gene, A7870, and significantly suppressed cell/tumor growth of breast cancer. These findings suggest that A7870 might play a key role in tumor cell growth and proliferation, and might be a promising target for development of anti-cancer drugs.

A7870, designated TOPK, a new member of the MAPKK family, was selected for study because of its significantly elevated expression in breast cancer. We identified transcripts of approximately 1.8 and 1.9 kb that showed cancer-specific expression. These transcripts have different 5′ UTR sequences but the same ORF. We demonstrated that treatment of breast cancer cells with siRNA effectively inhibited expression of A7870 and significantly suppressed cell/tumor growth of breast cancer. These findings suggest that A7870 might play a key role in tumor cell growth and proliferation, and might be a promising target for development of anti-cancer drugs.

The ability of some criteria to predict disease progression and clinical outcome is, however, imperfect. Patients with more aggressive disease can benefit from adjuvant chemotherapy or hormone therapy and are currently identified according to a combination of criteria: age, the size of the tumor, axillary-node status, the histologic type and pathological grade of cancer, and hormone-receptor status. Histologically different tumors were classified by a subset of genes, a process that provides pathologically relevant information. Most investigators have suggested that patients have a poorer prognosis if the tumor shows a significantly higher percentage of poorly differentiated histology.

A surprising result from this study was the remarkable similarity in the expression profiles of different histological types in each patient. Through microdissection and global gene expression analysis, changes in gene expression associated with invasion and prognosis were examined by supervised analysis of mRNA expression profiles from breast cancer cells of the well-differentiated and poorly differentiated types. Through an unsupervised classification analysis on the basis of expression profiles, breast cancers can be divided into two groups that are shown to associate with different pathological lesions. Twenty-five genes whose expression differs significantly between these two groups of patients were identified by a random permutation test (FIG. 2C). Among these genes, nuclear factor of activated T-cells 5 (NFAT5) is restricted to promoting carcinoma cell migration, which highlights the possibility of distinct genes that are induced by these transcription factors (Sebastien J. et al., The role of NFAT transcription factors in integrin-mediated carcinoma invasion. Nature cell biology, 4: 540-544, 2002). Thrombospondin 2 (THBS2) is an extracellular matrix protein that appears to play a role in cell adhesion and cell migration. One important advantage of the LMM-based approach is the ability to select cancer cells of different phenotypes from a single specimen. Systematic analysis of gene-expression patterns provides a window on the biology and pathogenesis of invasion.
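As an illustration of how a random permutation test can assign a p-value to the difference in one gene's expression between two patient groups, a minimal sketch follows. It is not the procedure used in this study: the test statistic (absolute difference of group means), the number of permutations, the function name permutation_p_value, and the example expression values are all assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    The statistic (absolute difference of means) is an assumption made
    for this sketch; the source does not state which statistic was used.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical log-ratio values for one gene in two groups of patients.
well_differentiated = [0.2, 0.4, 0.1, 0.3, 0.5, 0.2]
poorly_differentiated = [1.6, 1.9, 1.4, 2.1, 1.8, 1.7]
print(permutation_p_value(well_differentiated, poorly_differentiated))
```

In such a scheme, genes would be ranked by p-value and those falling below a chosen threshold retained; any threshold would itself be a design choice and should be adjusted for the number of genes tested.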

Furthermore, lymph-node metastasis is a critical step in tumor progression and one of the major components of poor prognosis in breast cancer patients (Shek, L. L. and Godolphin, W. Model for breast cancer survival: relative prognostic roles of axillary nodal status, TNM stage, estrogen receptor concentration, and tumor necrosis. Cancer Res, 48: 5565-5569, 1988), but only a minority of patients exhibit clinically detectable metastases at diagnosis. Although lymph-node status at diagnosis is the most important measure for future recurrence and overall survival, it is a surrogate that is imperfect at best. About a third of patients with no detectable lymph-node involvement, for example, will develop recurrent disease within 10 years (Saphner, T., et al., Annual hazard rates of recurrence for breast cancer after primary therapy. J Clin Oncol, 14: 2738-2746, 1996). Sentinel lymph node biopsy was shown to be an accurate procedure in the study of axillary lymph nodes; it allowed a marked decrease in surgery-related morbidity of breast cancer, and axillary dissection could be avoided. Other parameters, such as nuclear grading, patient age, and tumor size, are not able to predict the axillary lymph node status, and it is not possible to effectively diagnose lymph node status by sentinel lymph node biopsy. Therefore, the present identification of a subset of genes differentially expressed between node-positive and node-negative tumors can contribute to improved clinical diagnosis and to an understanding of the precise biophysical events. Cluster analysis (FIG. 10) appeared to separate cases with lymph-node metastasis from those without metastasis. The genes that contributed to separation of the two patient groups according to the status of lymph-node metastasis may serve as molecular markers for metastasis (Ramaswamy, S., et al., A molecular signature of metastasis in primary solid tumors. Nat Genet, 33: 49-54, 2003). For example, among these 34 genes, FUS, also known as TLS (translocated in liposarcoma), whose expression is decreased in node-negative cancers, is translocated with the gene encoding the transcription factor ERG-1 in human myeloid leukaemias. One of the important functions of wild-type FUS is genome maintenance, particularly the maintenance of genomic stability (Hicks, G. G., et al., Fus deficiency in mice results in defective B-lymphocyte development and activation, high levels of chromosomal instability and perinatal death. Nat Genet, 24: 175-179, 2000). Expression levels were increased for some of the genes in the metastasis-positive group as compared to the negative group. For example, regarding EEF1D, the higher expression of EF-1 delta in the tumours suggested that malignant transformation in vivo requires an increase in translation factor mRNA and protein synthesis for entry into and transition through the cell cycle. CFL1 is involved in Rho protein signal transduction; Rho family GTPases regulate the cytoskeleton and cell migration and are frequently overexpressed in tumours (Yoshizaki, H., et al., Activity of Rho-family GTPases during cell division as visualized with FRET-based probes. J Cell Biol, 162: 223-232, 2003; Arthur, W. T., et al., Regulation of Rho family GTPases by cell-cell and cell-matrix adhesion. Biol Res, 35: 239-246, 2002). BRAF, the B-Raf kinase, was shown to be capable of phosphorylating and activating MEK as a result of growth factor stimulation. Although the function of some of these genes is still unknown, understanding the function of these gene products may clarify their roles in metastasis in breast cancer.

The causes and clinical course of recurrence are presently unknown. Furthermore, it is not possible to predict outcome reliably on the basis of available clinical, pathological, and genetic markers. Although it is believed that the predictive scoring system of the present invention, using the expression profiles of these 34 genes, may be useful for improving prognosis, verification using a larger number of cases may be needed before introduction into clinical use. In any event, the present invention appears to provide precise information about the biological nature of cancer cells that has been misunderstood by conventional histological diagnosis.

Cancer therapies directed at specific molecular alterations that occur in cancer cells have been validated through clinical development and regulatory approval of anti-cancer drugs such as trastuzumab (Herceptin) for the treatment of advanced breast cancer (Coussens, L., et al. Tyrosine kinase receptor with extensive homology to EGF receptor shares chromosomal location with neu oncogene. Science, 230: 1132-1139, 1985). This drug is clinically effective and better tolerated than traditional anti-cancer agents because it targets only transformed cells. Hence, this drug not only improves survival and quality of life for cancer patients, but also validates the concept of molecularly targeted cancer therapy. Furthermore, targeted drugs can enhance the efficacy of standard chemotherapy when used in combination therewith (Gianni, L. and Grasselli, G. Targeting the epidermal growth factor receptor a new strategy in cancer treatment. Suppl Tumori, 1: S60-61, 2002; Klejman, A., et al., Phosphatidylinositol-3 kinase inhibitors enhance the anti-leukemia effect of STI571. Oncogene, 21: 5868-5876, 2002). Therefore, future cancer treatments will probably involve combining conventional drugs with target-specific agents aimed at different characteristics of tumor cells, such as angiogenesis and invasiveness. Furthermore, the present invention demonstrates that the novel tumor markers, substances that may be present in abnormal amounts in the blood or nipple aspirates of a woman who has breast cancer, may be reliable enough to be used routinely to detect early breast cancer.

Currently, no effective treatment is available for patients with advanced breast cancer. Thus, new therapeutic approaches and tailor-made treatments are urgently required. The cancer-specific expression profiles of the present invention, including up- and down-regulated genes in breast cancers, should provide useful information for identifying molecular targets for the treatment of patients.

TABLE 1 List of genes with altered expression between well and poorly differentiated type in histological phenotype
BRC NO. | ACCESSION NO. | Symbol | TITLE | p-value
1 | AF053712 | TNFSF11 | tumor necrosis factor (ligand) superfamily, member 11 | 1.2E−06
2 | BF973104 | LOC201725 | hypothetical protein LOC201725 | 3.2E−05
3 | AV752313 | KPNA6 | karyopherin alpha 6 (importin alpha 7) | 1.1E−04
4 | AK026898 | FOXP1 | forkhead box P1 | 7.4E−04
5 | AA148107 | ITGA5 | integrin, alpha 5 (fibronectin receptor, alpha polypeptide) | 7.9E−04
6 | AK001067 | NFAT5 | nuclear factor of activated T-cells 5, tonicity-responsive | 8.2E−04
7 | AB007919 | KIAA0450 | KIAA0450 gene product | 1.8E−03
8 | BG026429 | SFRS2 | splicing factor, arginine/serine-rich 2 | 2.0E−03
9 | M87770 | FGFR2 | fibroblast growth factor receptor 2 (bacteria-expressed kinase, keratinocyte growth factor receptor, craniofacial dysostosis 1, Crouzon syndrome, Pfeiffer syndrome, Jackson-Weiss syndrome) | 2.1E−03
10 | L02785 | SLC26A3 | solute carrier family 26, member 3 | 2.7E−03
11 | BF037402 |  | Homo sapiens, clone MGC: 17296 IMAGE: 3460701, mRNA, complete cds | 2.8E−03
12 | L12350 | THBS2 | thrombospondin 2 | 2.8E−03
13 | N36875 |  | Homo sapiens, clone IMAGE: 4994678, mRNA | 3.8E−03
14 | AL135342 |  | ESTs, Weakly similar to neuronal thread protein [Homo sapiens] [H. sapiens] | 4.3E−03
15 | AL049426 | SDC3 | syndecan 3 (N-syndecan) | 4.5E−03
16 | AW961424 | KIAA1870 | KIAA1870 protein | 5.2E−03
17 | AA523117 | DC-TM4F2 | tetraspanin similar to TM4SF9 | 5.5E−03
18 | Z11531 | EEF1G | eukaryotic translation elongation factor 1 gamma | 6.1E−03
19 | AI423028 | SMARCD3 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 3 | 6.8E−03
20 | AB002391 | MN7 | D15F37 (pseudogene) | 7.1E−03
21 | D32050 | AARS | alanyl-tRNA synthetase | 7.2E−03
22 | BE876949 | RAB7 | RAB7, member RAS oncogene family | 7.9E−03
23 | AW291083 |  | ESTs | 8.0E−03
24 | AI568910 |  | ESTs | 8.2E−03
25 | AK023480 | SRP72 | signal recognition particle 72 kDa | 8.7E−03

TABLE 2 List of genes with altered expression between ER-positive and ER-negative tumors
BRC NO. | ACCESSION NO. | Symbol | TITLE | p-value
26 | AW949747 | GATA3 | GATA binding protein 3 | 3.2E−20
27 | BE868254 | ESTs | ESTs | 2.2E−14
28 | AF037335 | CA12 | carbonic anhydrase XII | 1.6E−13
29 | BF724977 | ASB13 | ankyrin repeat and SOCS box-containing 13 | 8.5E−13
30 | NM_004636 | SEMA3B | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B | 9.7E−13
31 | NM_000125 | ESR1 | estrogen receptor 1 | 1.2E−12
32 | M73554 | CCND1 | cyclin D1 (PRAD1: parathyroid adenomatosis 1) | 3.9E−12
33 | NM_005544 | IRS1 | insulin receptor substrate 1 | 4.4E−12
34 | M14745 | BCL2 | B-cell CLL/lymphoma 2 | 5.1E−12
35 | BE826171 | BCMP11 | breast cancer membrane protein 11 | 2.8E−11
36 | AI087270 | SIAH2 | seven in absentia homolog 2 (Drosophila) | 2.8E−11
37 | L07033 | HMGCL | 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria) | 2.8E−11
38 | AB014523 | ULK2 | unc-51-like kinase 2 (C. elegans) | 4.0E−11
39 | AL137588 | DKFZp434K1210 | hypothetical protein DKFZp434K1210 | 5.2E−11
40 | AL137566 | EST | Homo sapiens mRNA; cDNA DKFZp586G0321 (from clone DKFZp586G0321) | 5.4E−11
41 | AF038421 | GFRA1 | GDNF family receptor alpha 1 | 8.4E−11
42 | AI194045 | FE65L2 | FE65-like protein 2 | 9.2E−11
43 | BG163478 | ESTs | ESTs, Weakly similar to BAI1_HUMAN Brain-specific angiogenesis inhibitor 1 precursor [H. sapiens] | 1.1E−10
44 | M31627 | XBP1 | X-box binding protein 1 | 1.1E−10
45 | AA156269 | EST | Homo sapiens, clone IMAGE: 4794107, mRNA | 1.3E−10
46 | NM_006763 | BTG2 | BTG family, member 2 | 1.9E−10
47 | AW504052 | SEC15L | SEC15 (S. cerevisiae)-like | 2.1E−10
48 | NM_005400 | PRKCE | protein kinase C, epsilon | 2.3E−10
49 | AI628151 | XBP1 | X-box binding protein 1 | 2.7E−10
50 | AF043045 | FLNB | filamin B, beta (actin binding protein 278) | 3.5E−10
51 | U31383 | GNG10 | guanine nucleotide binding protein (G protein), gamma 10 | 4.6E−10
52 | L10333 | RTN1 | reticulon 1 | 5.6E−10
53 | AK025099 | SIGIRR | single Ig IL-1R-related molecule | 6.2E−10
54 | AL039253 | LIV-1 | LIV-1 protein, estrogen regulated | 7.4E−10
55 | AW949662 | KIAA0239 | KIAA0239 protein | 8.0E−10
56 | D13629 | KTN1 | kinectin 1 (kinesin receptor) | 1.5E−09
57 | NM_000165 | GJA1 | gap junction protein, alpha 1, 43 kDa (connexin 43) | 1.5E−09
58 | AA533079 | C1orf21 | chromosome 1 open reading frame 21 | 1.8E−09
59 | AF251056 | CAPS2 | calcyphosphine 2 | 1.9E−09
60 | AF061016 | UGDH | UDP-glucose dehydrogenase | 2.0E−09
61 | U92544 | MAGED2 | melanoma antigen, family D, 2 | 2.1E−09
62 | BE617536 | RPL13A | ribosomal protein L13a | 2.4E−09
63 | AK024102 | MYST1 | MYST histone acetyltransferase 1 | 2.5E−09
64 | BF212902 | EST | Homo sapiens mRNA; cDNA DKFZp564F053 (from clone DKFZp564F053) | 2.8E−09
65 | AK025480 | FLJ21827 | hypothetical protein FLJ21827 | 3.0E−09
66 | AI376713 | ESTs | ESTs, Weakly similar to hypothetical protein FLJ20378 [Homo sapiens] [H. sapiens] | 3.6E−09
67 | AI028483 | ESTs | ESTs | 3.8E−09
68 | AK022249 | EST | Homo sapiens cDNA FLJ12187 fis, clone MAMMA1000831. | 4.2E−09
69 | AI568527 | EST | Homo sapiens cDNA FLJ34849 fis, clone NT2NE2011687. | 5.0E−09
70 | AL133074 | TP53INP1 | tumor protein p53 inducible nuclear protein 1 | 5.3E−09
71 | AF022116 | PRKAB1 | protein kinase, AMP-activated, beta 1 non-catalytic subunit | 6.1E−09
72 | AF007170 | C1orf34 | chromosome 1 open reading frame 34 | 9.7E−09
73 | AF042081 | SH3BGRL | SH3 domain binding glutamic acid-rich protein like | 1.2E−08
74 | AK027813 | MGC10744 | hypothetical protein MGC10744 | 1.4E−08
75 | M57609 | GLI3 | GLI-Kruppel family member GLI3 (Greig cephalopolysyndactyly syndrome) | 1.7E−08
76 | AL359600 | EST | Homo sapiens mRNA; cDNA DKFZp547C136 (from clone DKFZp547C136) | 1.9E−08
77 | BQ006049 | TIMP1 | tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor) | 2.1E−08
78 | AF111849 | HELO1 | homolog of yeast long chain polyunsaturated fatty acid elongation enzyme 2 | 2.2E−08
79 | AL157499 | RAB5EP | rabaptin-5 | 2.2E−08
80 | AK023199 | EST | Homo sapiens cDNA FLJ13137 fis, clone NT2RP3003150. | 2.5E−08
81 | J05176 | SERPINA3 | serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 | 3.2E−08
82 | AA028101 | KIAA0303 | KIAA0303 protein | 3.3E−08
83 | AI300588 | MAP2K4 | mitogen-activated protein kinase kinase 4 | 4.1E−08
84 | AA682861 | ESTs | ESTs, Moderately similar to hypothetical protein FLJ20378 [Homo sapiens] [H. sapiens] | 4.6E−08
85 | M26393 | ACADS | acyl-Coenzyme A dehydrogenase, C-2 to C-3 short chain | 5.4E−08
86 | NM_001609 | ACADSB | acyl-Coenzyme A dehydrogenase, short/branched chain | 5.5E−08
87 | U91543 | CHD3 | chromodomain helicase DNA binding protein 3 | 5.7E−08
88 | AK023813 | FLJ10081 | hypothetical protein FLJ10081 | 6.0E−08
89 | BF111711 | FLJ20727 | hypothetical protein FLJ20727 | 7.0E−08
90 | AL049987 | EST | Homo sapiens mRNA; cDNA DKFZp564F112 (from clone DKFZp564F112) | 7.2E−08
91 | AW081894 | EST | EST | 8.2E−08
92 | AK000350 | FLJ20343 | hypothetical protein FLJ20343 | 1.1E−07
93 | AA418493 | DPP7 | dipeptidylpeptidase 7 | 1.1E−07
94 | BE674061 | PIN4 | protein (peptidyl-prolyl cis/trans isomerase) NIMA-interacting, 4 (parvulin) | 1.2E−07
95 | AB011155 | DLG5 | discs, large (Drosophila) homolog 5 | 1.2E−07
96 | L15203 | TFF3 | trefoil factor 3 (intestinal) | 1.4E−07
97 | NM_001552 | IGFBP4 | insulin-like growth factor binding protein 4 | 1.4E−07
98 | M57230 | IL6ST | interleukin 6 signal transducer (gp130, oncostatin M receptor) | 1.5E−07
99 | N92706 | EST | Homo sapiens cDNA FLJ38461 fis, clone FEBRA2020977. | 1.5E−07
100 | M30704 | AREG | amphiregulin (schwannoma-derived growth factor) | 1.8E−07
101 | AB004066 | BHLHB2 | basic helix-loop-helix domain containing, class B, 2 | 2.2E−07
102 | M15518 | PLAT | plasminogen activator, tissue | 2.3E−07
103 | BM697477 | ShrmL | Shroom-related protein | 2.4E−07
104 | R45979 | CELSR1 | cadherin, EGF LAG seven-pass G-type receptor 1 (flamingo homolog, Drosophila) | 3.0E−07
105 | AL049365 | EST | Homo sapiens mRNA; cDNA DKFZp586A0618 (from clone DKFZp586A0618) | 6.5E−07
106 | NM_003225 | TFF1 | trefoil factor 1 (breast cancer, estrogen-inducible sequence expressed in) | 7.1E−07
107 | AI733356 | EST | Homo sapiens cDNA FLJ31746 fis, clone NT2RI2007334. | 7.8E−07
108 | AF078853 | KIAA1243 | KIAA1243 protein | 8.2E−07
109 | N30179 | PLAB | prostate differentiation factor | 1.0E−06
110 | BG026429 | SFRS2 | splicing factor, arginine/serine-rich 2 | 2.4E−06
111 | AU149272 | ESTs | ESTs | 2.5E−06
112 | J03827 | NSEP1 | nuclease sensitive element binding protein 1 | 3.0E−06
113 | AJ276469 | C20orf35 | chromosome 20 open reading frame 35 | 3.4E−06
114 | AW295100 | LOC201562 | hypothetical protein LOC201562 | 3.9E−06
115 | J03817 | GSTM1 | glutathione S-transferase M1 | 4.8E−06
116 | AF288571 | LEF1 | lymphoid enhancer-binding factor 1 | 5.1E−06
117 | AF069301 | PECI | peroxisomal D3,D2-enoyl-CoA isomerase | 5.3E−06
118 | AA621665 | EST | EST | 6.7E−06
119 | AI739486 | ESTs | ESTs | 8.0E−06
120 | X81438 | AMPH | amphiphysin (Stiff-Man syndrome with breast cancer 128 kDa autoantigen) | 8.7E−06
121 | U89606 | PDXK | pyridoxal (pyridoxine, vitamin B6) kinase | 8.8E−06
122 | NM_017555 | EGLN2 | egl nine homolog 2 (C. elegans) | 9.2E−06

TABLE 3 Genes commonly up-regulated in DCIS and IDC
BRC NO. | ACCESSION NO. | Symbol | TITLE
123 | D90041 | NAT1 | N-acetyltransferase 1 (arylamine N-acetyltransferase)
124 | M13755 | G1P2 | interferon, alpha-inducible protein (clone IFI-15K)
125 | D88308 | SLC27A2 | solute carrier family 27 (fatty acid transporter), member 2
126 | AW235061 NM_004170 | SLC1A1 | solute carrier family 1 (neuronal/epithelial high affinity glutamate transporter, system Xag), member 1
127 | K02215 | AGT | angiotensinogen (serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 8)
128 | AB032261 | SCD | stearoyl-CoA desaturase (delta-9-desaturase)
129 | NM_000909 | NPY1R | neuropeptide Y receptor Y1
130 | AF017790 | HEC | highly expressed in cancer, rich in leucine heptad repeats
131 | NM_007019 | UBE2C | ubiquitin-conjugating enzyme E2C
132 | AF065388 | TSPAN-1 | tetraspan 1
133 | N70334 | DUSP10 | dual specificity phosphatase 10
134 | AA621719 NM_005496 | SMC4L1 | SMC4 structural maintenance of chromosomes 4-like 1 (yeast)
135 | AA676987 |  | ESTs
136 | AK001402 NM_018131 | C10orf3 | chromosome 10 open reading frame 3
137 | AW949747 NM_002051 | GATA3 | GATA binding protein 3
138 | AK001472 NM_018685 | ANLN | anillin, actin binding protein (scraps homolog, Drosophila)
139 | AA789233 NM_000088 | COL1A1 | collagen, type I, alpha 1
140 | AF070632 |  | Homo sapiens clone 24405 mRNA sequence
141 | H04544 | NPY1R | neuropeptide Y receptor Y1
142 | AI015982 | CDCA1 | cell division cycle associated 1
143 | NM_003979 | RAI3 | retinoic acid induced 3
144 | BF516445 NM_053277 | CLIC6 | chloride intracellular channel 6
145 | AI361654 |  | 
146 | AI077540 NM_178530 |  | Homo sapiens cDNA FLJ38379 fis, clone FEBRA2002986.
147 | AI261804 |  | Homo sapiens MSTP020 (MST020) mRNA, complete cds
148 | AK026559 | TPM3 | tropomyosin 3
149 | J03473 | ADPRT | ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)
150 | NM_000187 | HGD | homogentisate 1,2-dioxygenase (homogentisate oxidase)
151 | L43964 | PSEN2 | presenilin 2 (Alzheimer disease 4)
152 | J05581 | MUC1 | mucin 1, transmembrane
153 | AA602499 XM_379784 | GLCCI1 | glucocorticoid induced transcript 1
154 | U37707 | MPP3 | membrane protein, palmitoylated 3 (MAGUK p55 subfamily member 3)
155 | AB030905 | CBX3 | chromobox homolog 3 (HP1 gamma homolog, Drosophila)
156 | AL138409 NM_198278 |  | Homo sapiens mRNA; cDNA DKFZp313L231 (from clone DKFZp313L231)
157 | AV756928 | SEC61G | Sec61 gamma
158 | AI205684 NM_021979 | HSPA2 | heat shock 70 kDa protein 2
159 | BE739464 NM_015161 | ARL6IP | ADP-ribosylation factor-like 6 interacting protein
160 | AI081356 NM_203463 | LOC253782 | hypothetical protein LOC253782
161 | AA167194 | LOC253782 | hypothetical protein LOC253782
162 | M90516 | GFPT1 | glutamine-fructose-6-phosphate transaminase 1
163 | AL133074 NM_033285 | TP53INP1 | tumor protein p53 inducible nuclear protein 1
164 | AL137257 |  | Homo sapiens, clone IMAGE: 5296692, mRNA
165 | AK025240 NM_147128 | LOC223082 | LOC223082
166 | AJ007042 | WHSC1 | Wolf-Hirschhorn syndrome candidate 1
167 | U42068 | GRP58 | glucose regulated protein, 58 kDa
168 | AJ132592 | ZNF281 | zinc finger protein 281
169 | W93638 |  | ESTs
170 | AW977394 | C9orf12 | chromosome 9 open reading frame 12
171 | AI347925 NM_001540 | HSPB1 | heat shock 27 kDa protein 1
172 | AK026587 | NET-6 | transmembrane 4 superfamily member tetraspan NET-6
173 | AI264621 | LASS2 | LAG1 longevity assurance homolog 2 (S. cerevisiae)
174 | AA767828 XM_035527 | FLJ10980 | hypothetical protein FLJ10980
175 | AU142881 NM_018184 | FLJ10702 | hypothetical protein FLJ10702

TABLE 4 Genes commonly down-regulated in DCIS and IDC
BRC NO. | ACCESSION NO. | Symbol | TITLE
176 | X52186 | ITGB4 | integrin, beta 4
177 | NM_006297 | XRCC1 | X-ray repair complementing defective repair in Chinese hamster cells 1
178 | X73460 | RPL3 | ribosomal protein L3
179 | NM_001436 | FBL | fibrillarin
180 | X59373 | HOXD10 | homeo box D10
181 | J04208 | IMPDH2 | IMP (inosine monophosphate) dehydrogenase 2
182 | L24203 | TRIM29 | tripartite motif-containing 29
183 | L10340 NM_001958 | EEF1A2 | eukaryotic translation elongation factor 1 alpha 2
184 | J04621 | SDC2 | syndecan 2 (heparan sulfate proteoglycan 1, cell surface-associated, fibroglycan)
185 | L08424 | ASCL1 | achaete-scute complex-like 1 (Drosophila)
186 | AI376713 | EST | ESTs, Weakly similar to hypothetical protein FLJ20378 [Homo sapiens] [H. sapiens]
187 | AK026966 | EST | Homo sapiens cDNA: FLJ23313 fis, clone HEP11919.
188 | NM_001050 | SSTR2 | somatostatin receptor 2
189 | AA632025 | EST | ESTs
190 | N22918 NM_144641 | FLJ32332 | hypothetical protein FLJ32332
191 | AF272043 | ITM2C | integral membrane protein 2C
192 | M58459 | RPS4Y | ribosomal protein S4, Y-linked
193 | AI133697 | EST | Homo sapiens, clone MGC: 16362 IMAGE: 3927795, mRNA, complete cds
194 | AA780301 NM_003793 | CTSF | cathepsin F
195 | M92843 | ZFP36 | zinc finger protein 36, C3H type, homolog (mouse)
196 | AA570186 | EST | Human full-length cDNA 5-PRIME end of clone CS0DK007YB08 of HeLa cells of Homo sapiens (human)
197 | R56906 | EST | EST
198 | AF208860 NM_014452 | TNFRSF21 | tumor necrosis factor receptor superfamily, member 21
199 | AK025216 | TAZ | transcriptional co-activator with PDZ-binding motif (TAZ)
200 | AA758394 | PTPN1 | protein tyrosine phosphatase, non-receptor type 1
201 | AA628530 NM_016368 | ISYNA1 | myo-inositol 1-phosphate synthase A1
202 | AF161416 NM_003749 | IRS2 | insulin receptor substrate 2
203 | AL045916 | EST | ESTs
204 | AW340972 | EST | Homo sapiens cDNA: FLJ22864 fis, clone KAT02164.
205 | AI189414 | RNPC2 | RNA-binding region (RNP1, RRM) containing 2
206 | AV705636 | EIF3S6IP | eukaryotic translation initiation factor 3, subunit 6 interacting protein
207 | U28977 | CASP4 | caspase 4, apoptosis-related cysteine protease
208 | AV708528 NM_018579 | MSCP | mitochondrial solute carrier protein
209 | AA022956 NM_024667 | FLJ12750 | hypothetical protein FLJ12750
210 | AI928443 | EST | Homo sapiens cDNA FLJ38855 fis, clone MESAN2010681.
211 | U14966 | RPL5 | ribosomal protein L5
212 | AI857997 | TPBG | trophoblast glycoprotein
213 | BF697545 | MGP | matrix Gla protein
214 | AW575754 NM_152309 | FLJ35564 | hypothetical protein FLJ35564
215 | AI352534 NM_001753 | CAV1 | caveolin 1, caveolae protein, 22 kDa
216 | NM_001985 | ETFB | electron-transfer-flavoprotein, beta polypeptide
217 | AI743134 NM_006216 | SERPINE2 | serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2
218 | AW444709 NM_001777 | CD47 | CD47 antigen (Rh-related antigen, integrin-associated signal transducer)
219 | BF688910 NM_001300 | COPEB | core promoter element binding protein
220 | AI818579 NM_181847 | EST | Homo sapiens, clone IMAGE: 3625286, mRNA, partial cds
221 | S95936 | TF | transferrin
222 | AF074393 | RPS6KA5 | ribosomal protein S6 kinase, 90 kDa, polypeptide 5
223 | NM_000591 | CD14 | CD14 antigen
224 | AK027181 NM_031426 | IBA2 | ionized calcium binding adapter molecule 2
225 | X73079 | PIGR | polymeric immunoglobulin receptor
226 | NM_001343 | DAB2 | disabled homolog 2, mitogen-responsive phosphoprotein (Drosophila)
227 | M31452 | C4BPA | complement component 4 binding protein, alpha
228 | X07696 | KRT15 | keratin 15
229 | AF016004 | GPM6B | glycoprotein M6B
230 | NM_004078 | CSRP1 | cysteine and glycine-rich protein 1
231 | L36645 | EPHA4 | EphA4
232 | D78011 | DPYS | dihydropyrimidinase
233 | W60630 NM_032801 | JAM3 | junctional adhesion molecule 3
234 | AW956111 | D4S234E | DNA segment on chromosome 4 (unique) 234 expressed sequence
235 | AF035752 | CAV2 | caveolin 2
236 | D37766 | LAMB3 | laminin, beta 3
237 | U66406 | EFNB3 | ephrin-B3
238 | X52001 | EDN3 | endothelin 3
239 | NM_000856 | GUCY1A3 | guanylate cyclase 1, soluble, alpha 3
240 | U60115 | FHL1 | four and a half LIM domains 1
241 | D14520 NM_001730 | KLF5 | Kruppel-like factor 5 (intestinal)
242 | M99487 | FOLH1 | folate hydrolase (prostate-specific membrane antigen) 1
243 | U09873 | FSCN1 | fascin homolog 1, actin-bundling protein (Strongylocentrotus purpuratus)
244 | AF017418 | MEIS2 | Meis1, myeloid ecotropic viral integration site 1 homolog 2 (mouse)
245 | AF038540 NM_206900 | RTN2 | reticulon 2
246 | AF049884 NM_021069 | ARGBP2 | Arg/Abl-interacting protein ArgBP2
247 | NM_001122 | ADFP | adipose differentiation-related protein
248 | Y09926 | MASP2 | mannan-binding lectin serine protease 2
249 | M58297 | ZNF42 | zinc finger protein 42 (myeloid-specific retinoic acid-responsive)
250 | AF035811 | PNUTL2 | peanut-like 2 (Drosophila)
251 | L22214 | ADORA1 | adenosine A1 receptor
252 | AF177775 | CES1 | carboxylesterase 1 (monocyte/macrophage serine esterase 1)
253 | U07643 | LTF | lactotransferrin
254 | S76474 NM_006180 | NTRK2 | neurotrophic tyrosine kinase, receptor, type 2
255 | BE299605 NM_012219 | MRAS | muscle RAS oncogene homolog
256 | NM_006225 | PLCD1 | phospholipase C, delta 1
257 | NM_005036 | PPARA | peroxisome proliferative activated receptor, alpha
258 | M22324 | ANPEP | alanyl (membrane) aminopeptidase (aminopeptidase N, aminopeptidase M, microsomal aminopeptidase, CD13, p150)
259 | BE877416 | TGFBR2 | transforming growth factor, beta receptor II (70/80 kDa)
260 | BE561244 | RPL18A | ribosomal protein L18a
261 | AL048962 | EST | Homo sapiens, clone IMAGE: 4243767, mRNA
262 | L08895 | MEF2C | MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C)
263 | U48707 | PPP1R1A | protein phosphatase 1, regulatory (inhibitor) subunit 1A
264 | X56134 | RPLP2 | ribosomal protein, large P2
265 | D84239 | FCGBP | Fc fragment of IgG binding protein
266 | AK026181 | PHLDA1 | pleckstrin homology-like domain, family A, member 1
267 | K01144 | CD74 | CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated)
268 | U25138 | KCNMB1 | potassium large conductance calcium-activated channel, subfamily M, beta member 1
269 | X85337 NM_053025 | MYLK | myosin, light polypeptide kinase
270 | D83597 | LY64 | lymphocyte antigen 64 homolog, radioprotective 105 kDa (mouse)
271 | NM_004024 | ATF3 | activating transcription factor 3
272 | BF126636 | SAA1 | serum amyloid A1
273 | D13789 | MGAT3 | mannosyl (beta-1,4-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase
274 | L41142 | STAT5A | signal transducer and activator of transcription 5A
275 | AB040969 | KIAA1536 | KIAA1536 protein
276 | NM_002153 | HSD17B2 | hydroxysteroid (17-beta) dehydrogenase 2
277 | AV646610 NM_001546 | ID4 | inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
278 | X03663 | CSF1R | colony stimulating factor 1 receptor, formerly McDonough feline sarcoma viral (v-fms) oncogene homolog
279 | U47025 | PYGB | phosphorylase, glycogen; brain
280 | M81349 | SAA4 | serum amyloid A4, constitutive
281 | AI264201 NM_000399 | EGR2 | early growth response 2 (Krox-20 homolog, Drosophila)
282 | U18018 | ETV4 | ets variant gene 4 (E1A enhancer binding protein, E1AF)
283 | NM_004350 | RUNX3 | runt-related transcription factor 3
284 | BF337516 | CRYAB | crystallin, alpha B
285 | AF027208 | PROML1 | prominin-like 1 (mouse)
286 | D17408 | CNN1 | calponin 1, basic, smooth muscle
287 | NM_004010 | DMD | dystrophin (muscular dystrophy, Duchenne and Becker types)
288 | BF183952 | CSTA | cystatin A (stefin A)
289 | M16445 | CD2 | CD2 antigen (p50), sheep red blood cell receptor
290 | AF055015 | EYA2 | eyes absent homolog 2 (Drosophila)
291 | AI745624 | ELL2 | ELL-related RNA polymerase II, elongation factor
292 | AK025329 | DKFZP566H073 | DKFZP566H073 protein
293 | BE745465 NM_012427 | KLK5 | kallikrein 5
294 | AK024578 NM_031455 | DKFZP761F241 | hypothetical protein DKFZp761F241
295 | AI870306 XM_380171 | IRX1 | iroquois homeobox protein 1
296 | H37853 NM_022343 | C9orf19 | chromosome 9 open reading frame 19
297 | BF000047 | EST | Homo sapiens full length insert cDNA clone ZA79C08
298 | AF126780 | RetSDR2 | retinal short-chain dehydrogenase/reductase 2
299 | AI700341 | EST | ESTs, Weakly similar to hypothetical protein FLJ20489 [Homo sapiens] [H. sapiens]
300 | M87770 | FGFR2 | fibroblast growth factor receptor 2 (bacteria-expressed kinase, keratinocyte growth factor receptor, craniofacial dysostosis 1, Crouzon syndrome, Pfeiffer syndrome, Jackson-Weiss syndrome)
301 | AA452368 NM_144595 | FLJ30046 | hypothetical protein FLJ30046
302 | NM_021200 | PLEKHB1 | pleckstrin homology domain containing, family B (evectins) member 1
303 | AK026343 | hIAN2 | human immune associated nucleotide 2
304 | AF251040 | C5orf6 | chromosome 5 open reading frame 6
305 | M87507 | CASP1 | caspase 1, apoptosis-related cysteine protease (interleukin 1, beta, convertase)
306 | M97675 | ROR1 | receptor tyrosine kinase-like orphan receptor 1
307 | NM_020549 | CHAT | choline acetyltransferase
308 | X00457 NM_033554 | HLA-DPA1 | major histocompatibility complex, class II, DP alpha 1
309 | W72411 NM_003722 | TP73L | tumor protein p73-like
310 | AI769569 | EST | ESTs
311 | K02765 | C3 | complement component 3
312 | AW971490 | FLJ14906 | hypothetical protein FLJ14906
313 | AF077044 | RPAC2 | likely ortholog of mouse RNA polymerase 1-3 (16 kDa subunit)
314 | H70803 NM_015278 | KIAA0790 | KIAA0790 protein
315 | AL050367 XM_167709 | LOC221061 | hypothetical protein LOC221061
316 | AK001643 NM_018215 | FLJ10781 | hypothetical protein FLJ10781
317 | AW182273 | EST | Homo sapiens cDNA FLJ31517 fis, clone NT2RI2000007.
318 | W67951 | EST | Human S6 A-5 mRNA expressed in chromosome 6-suppressed melanoma cells.
319 | AL117605 | EST | Homo sapiens mRNA; cDNA DKFZp564N1063 (from clone DKFZp564N1063)
320 | AI376418 | EST | Homo sapiens cDNA FLJ35169 fis, clone PLACE6012908.
321 | AA683373 | EST | EST
322 | AK022877 | EST | Homo sapiens cDNA FLJ12815 fis, clone NT2RP2002546.
323 | NM_002258 | KLRB1 | killer cell lectin-like receptor subfamily B, member 1
324 | M69225 | BPAG1 | bullous pemphigoid antigen 1, 230/240 kDa
325 | AW299572 NM_015461 | EHZF | early hematopoietic zinc finger
326 | BE044467 NM_005737 | ARL7 | ADP-ribosylation factor-like 7
327 | AA938297 NM_017938 | FLJ20716 | hypothetical protein FLJ20716
328 | AA706316 NM_033317 | ZD52F10 | hypothetical gene ZD52F10
329 | AI827230 NM_153000 | APCDD1 | adenomatosis polyposis coli down-regulated 1
330 | AK000251 | FLJ20244 | hypothetical protein FLJ20244
331 | N62352 NM_020925 | KIAA1573 | KIAA1573 protein
332 | H53164 | ICSBP1 | interferon consensus sequence binding protein 1
333 | BE394824 | WFDC2 | WAP four-disulfide core domain 2
334 | AL117462 NM_015481 | ZFP385 | likely ortholog of mouse zinc finger protein 385
335 | NM_003186 | TAGLN | transgelin
336 | U58514 | CHI3L2 | chitinase 3-like 2
337 | AB026125 | ART-4 | ART-4 protein
338 | AL080059 NM_033512 | KIAA1750 | KIAA1750 protein
339 | AA747005 | SDCCAG43 | serologically defined colon cancer antigen 43
340 | NM_005928 | MFGE8 | milk fat globule-EGF factor 8 protein
341 | D62470 NM_004796 | NRXN3 | neurexin 3
342 | N29574 | RAGD | Rag D protein
343 | K02276 | MYC | v-myc myelocytomatosis viral oncogene homolog (avian)
344 | D78611 | MEST | mesoderm specific transcript homolog (mouse)
345 | NM_022003 | FXYD6 | FXYD domain containing ion transport regulator 6
346 | BF508973 | RPL13 | ribosomal protein L13
347 | NM_001615 | ACTG2 | actin, gamma 2, smooth muscle, enteric
348 | R41532 | EST | ESTs, Weakly similar to POL2_MOUSE Retrovirus-related POL polyprotein [Contains: Reverse transcriptase; Endonuclease] [M. musculus]
349 | AA142875 | EST | ESTs
350 | U03688 | CYP1B1 | cytochrome P450, family 1, subfamily B, polypeptide 1
351 | W94363 | EST | Homo sapiens full length insert cDNA clone ZE12G01
352 | W44613 | HSJ001348 | cDNA for differentially expressed CO16 gene
353 | AL118812 | EST | Homo sapiens mRNA; cDNA DKFZp761G1111 (from clone DKFZp761G1111)
354 | D56064 | MAP2 | microtubule-associated protein 2
355 | BF966838 NM_172069 | KIAA2028 | similar to PH (pleckstrin homology) domain
356 | AI338625 NM_014344 | FJX1 | four jointed box 1 (Drosophila)
357 | AI263022 | EST | ESTs
358 | AL050107 NM_015472 | TAZ | transcriptional co-activator with PDZ-binding motif (TAZ)
359 | AI056364 NM_033210 | FLJ14855 | hypothetical protein FLJ14855
360 | AI351898 NM_032581 | DRCTNNB1A | down-regulated by Ctnnb1, a
361 | AV700003 | ARL6IP2 | ADP-ribosylation-like factor 6 interacting protein 2
362 | NM_000700 | ANXA1 | annexin A1
363 | M81141 | HLA-DQB1 | major histocompatibility complex, class II, DQ beta 1
364 | AI598227 NM_024911 | FLJ23091 | hypothetical protein FLJ23091
365 | BG034740 | ROPN1 | ropporin, rhophilin associated protein 1
366 | AB011175 | TBC1D4 | TBC1 domain family, member 4
367 | AK024449 | PP2135 | PP2135 protein
368 | AW978770 | DKFZP566A1524 | hypothetical protein DKFZp566A1524
369 | AI821113 | EST | Homo sapiens cDNA FLJ36327 fis, clone THYMU2005748.
370 | AI057450 | SLC13A2 | solute carrier family 13 (sodium-dependent dicarboxylate transporter), member 2
371 | X86693 | SPARCL1 | SPARC-like 1 (mast9, hevin)
372 | AI224952 NM_173640 | FLJ40906 | hypothetical protein FLJ40906
373 | D13639 | CCND2 | cyclin D2

TABLE 5 Genes with elevated expression in transition from DCIS to IDC
BRC NO. | ACCESSION NO. | Symbol | TITLE
374 | U74612 | FOXM1 | forkhead box M1
375 | U63743 | KIF2C | kinesin family member 2C
376 | D88532 | PIK3R3 | phosphoinositide-3-kinase, regulatory subunit, polypeptide 3 (p55, gamma)
377 | NM_005532 | IFI27 | interferon, alpha-inducible protein 27
378 | D14657 | KIAA0101 | KIAA0101 gene product
379 | AF030186 | GPC4 | glypican 4
380 | Z11566 | STMN1 | stathmin 1/oncoprotein 18
381 | U90914 NM_001304 | CPD | carboxypeptidase D
382 | NM_002534 | OAS1 | 2′,5′-oligoadenylate synthetase 1, 40/46 kDa
383 | S67310 | BF | B-factor, properdin
384 | AA192445 NM_020182 | TMEPAI | transmembrane, prostate androgen induced RNA
385 | AB003103 | PSMD12 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 12
386 | BE878057 NM_030796 | DKFZP564K0822 | hypothetical protein DKFZp564K0822
387 | AB003698 | CDC7L1 | CDC7 cell division cycle 7-like 1 (S. cerevisiae)
388 | M91670 | E2-EPF | ubiquitin carrier protein
389 | AK023414 | FLJ13352 | hypothetical protein FLJ13352
390 | L09235 | ATP6V1A1 | ATPase, H+ transporting, lysosomal 70 kDa, V1 subunit A, isoform 1
391 | AF007152 | ABHD3 | abhydrolase domain containing 3
392 | U33632 | KCNK1 | potassium channel, subfamily K, member 1
393 | AA621719 NM_005496 | SMC4L1 | SMC4 structural maintenance of chromosomes 4-like 1 (yeast)
394 | AF176228 | DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta
395 | H22566 NM_080759 | DACH | dachshund homolog (Drosophila)
396 | AI185804 NM_212482 | FN1 | fibronectin 1
397 | AI189477 NM_002168 | IDH2 | isocitrate dehydrogenase 2 (NADP+), mitochondrial
398 | AA205444 | AP1S2 | adaptor-related protein complex 1, sigma 2 subunit

TABLE 6 Genes with decreased expression in transition from DCIS to IDC
BRC NO. ACCESSION NO. Symbol TITLE
399 AF070609 NM_004172 SLC1A3 solute carrier family 1 (glial high affinity glutamate transporter), member 3
400 U85267 DSCR1 Down syndrome critical region gene 1
401 NM_005397 PODXL podocalyxin-like
402 D13811 AMT aminomethyltransferase (glycine cleavage system protein T)
403 X53586 ITGA6 integrin, alpha 6
404 L13288 VIPR1 vasoactive intestinal peptide receptor 1
405 M12125 TPM2 tropomyosin 2 (beta)
406 M65066 NM_002735 PRKAR1B protein kinase, cAMP-dependent, regulatory, type I, beta
407 AJ001183 SOX10 SRY (sex determining region Y)-box 10
408 AW241712 MXI1 MAX interacting protein 1
409 AL160111 KIAA1649 KIAA1649 protein
410 X93920 DUSP6 dual specificity phosphatase 6
411 AF132734 NM_021807 SEC8 secretory protein SEC8
412 AI133467 ESTs
413 D88153 HYA22 HYA22 protein
414 AF014404 PTE1 peroxisomal acyl-CoA thioesterase
415 BE907755 NM_013399 C16orf5 chromosome 16 open reading frame 5
416 AA135341 NM_021078 GCN5L2 GCN5 general control of amino-acid synthesis 5-like 2 (yeast)
417 AL110126 Homo sapiens mRNA; cDNA DKFZp564H1916 (from clone DKFZp564H1916)
418 BE254330 NM_003045 Homo sapiens mRNA; cDNA DKFZp564D016 (from clone DKFZp564D016)
419 BE264353 RBP1 retinol binding protein 1, cellular
420 W75991 Homo sapiens, clone IMAGE: 4249217, mRNA
421 AF091434 PDGFC platelet derived growth factor C
422 W67577 CD74 CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated)
423 NM_002996 CX3CL1 chemokine (C-X3-C motif) ligand 1
424 AA024459 ESTs
425 NM_000163 GHR growth hormone receptor
426 AA858162 NM_032160 NCAG1 NCAG1
427 BE327623 ESTs, Weakly similar to hypothetical protein FLJ20234 [Homo sapiens] [H. sapiens]
428 BE671156 MAPRE2 microtubule-associated protein, RP/EB family, member 2
429 D12614 LTA lymphotoxin alpha (TNF superfamily, member 1)
430 L13720 MGC5560 hypothetical protein MGC5560
431 U15131 ST5 suppression of tumorigenicity 5
432 Y00711 LDHB lactate dehydrogenase B
433 AI651212 Homo sapiens cDNA FLJ31125 fis, clone IMR322000819.
434 M31159 IGFBP3 insulin-like growth factor binding protein 3
435 NM_014447 HSU52521 arfaptin 1
436 AB011089 TRIM2 tripartite motif-containing 2
437 BF969355 NM_002612 PDK4 pyruvate dehydrogenase kinase, isoenzyme 4
438 AK025950 XM_371114 KIAA1695 hypothetical protein FLJ22297
439 D86961 NM_005779 LHFPL2 lipoma HMGIC fusion partner-like 2
440 AK025953 Homo sapiens cDNA: FLJ22300 fis, clone HRC04759.
441 AJ223812 CALD1 caldesmon 1
442 R40594 Homo sapiens cDNA: FLJ22845 fis, clone KAIA5195.
443 AF145713 SCHIP1 schwannomin interacting protein 1
444 AK024966 FLJ21313 hypothetical protein FLJ21313
445 NM_005596 NFIB nuclear factor I/B
446 NM_001613 ACTA2 actin, alpha 2, smooth muscle, aorta
447 H03641 XM_376328 FAM13A1 family with sequence similarity 13, member A1

TABLE 7 Genes commonly up-regulated in IDC
BRC NO. ACCESSION NO. Symbol TITLE
448 X14420 COL3A1 collagen, type III, alpha 1 (Ehlers-Danlos syndrome type IV, autosomal dominant)
449 AF044588 PRC1 protein regulator of cytokinesis 1
450 AF161499 HSPC150 HSPC150 protein similar to ubiquitin-conjugating enzyme
451 AA789233 NM_000088 COL1A1 collagen, type I, alpha 1
452 U16306 CSPG2 chondroitin sulfate proteoglycan 2 (versican)
453 NM_004425 ECM1 extracellular matrix protein 1
454 NM_006855 KDELR3 KDEL (Lys-Asp-Glu-Leu; SEQ ID NO: 52) endoplasmic reticulum protein retention receptor 3
455 AI972071 NM_031966 CCNB1 cyclin B1
456 AF237709 NM_018492 TOPK T-LAK cell-originated protein kinase (SEQ ID NOS: 48-51)
457 BE747327 HIST1H1C histone 1, H1c
458 J03464 COL1A2 collagen, type I, alpha 2
459 AI080640 NM_006408 AGR2 anterior gradient 2 homolog (Xenopus laevis)
460 AA971042 RHPN1 rhophilin, Rho GTPase binding protein 1
461 AI419398 MGC33662 hypothetical protein MGC33662
462 AI149552 NM_004448 ESTs, Moderately similar to ERB2_HUMAN Receptor protein-tyrosine kinase erbB-2 precursor (p185erbB2) (NEU proto-oncogene) (C-erbB-2) (Tyrosine kinase-type cell surface receptor HER2) (MLN 19) [H. sapiens]
463 D14874 ADM adrenomedullin
464 X03674 NM_000402 G6PD glucose-6-phosphate dehydrogenase
465 NM_002358 MAD2L1 MAD2 mitotic arrest deficient-like 1 (yeast)
466 BF214508 CYCS cytochrome c, somatic
467 BG030536 NM_001067 TOP2A topoisomerase (DNA) II alpha 170 kDa
468 X57766 MMP11 matrix metalloproteinase 11 (stromelysin 3)
469 AA029900 NM_015170 SULF1 sulfatase 1
470 AF053306 BUB1B BUB1 budding uninhibited by benzimidazoles 1 homolog beta (yeast)
471 AF074002 LGALS8 lectin, galactoside-binding, soluble, 8 (galectin 8)

TABLE 8 Genes commonly down-regulated in IDC
BRC NO. ACCESSION NO. Symbol TITLE
472 NM_004484 GPC3 glypican 3
473 NM_006219 PIK3CB phosphoinositide-3-kinase, catalytic, beta polypeptide
474 BE793000 RBP1 retinol binding protein 1, cellular
475 AL117565 NM_033027 AXUD1 AXIN1 up-regulated 1
476 BF055342 ZNF6 zinc finger protein 6 (CMPX1)
477 U03688 CYP1B1 cytochrome P450, family 1, subfamily B, polypeptide 1
478 AF038193 NM_004311 Homo sapiens, clone IMAGE: 3610040, mRNA
479 X72760 NM_002292 LAMB2 laminin, beta 2 (laminin S)
480 J03817 GSTM1 glutathione S-transferase M1
481 M69226 MAOA monoamine oxidase A
482 BF690180 NM_006990 WASF2 WAS protein family, member 2
483 AL133600 STAM2 signal transducing adaptor molecule (SH3 domain and ITAM motif) 2
484 AF215981 GPR2 G protein-coupled receptor 2
485 BG149764 Homo sapiens, clone IMAGE: 5286091, mRNA, partial cds
486 AF067800 CLECSF6 C-type (calcium dependent, carbohydrate-recognition domain) lectin, superfamily member 6
487 AA713487 PIK3R1 phosphoinositide-3-kinase, regulatory subunit, polypeptide 1 (p85 alpha)
488 AA828505 FBXW7 F-box and WD-40 domain protein 7 (archipelago homolog, Drosophila)
489 AK021865 CKIP-1 CK2 interacting protein 1; HQ0024c protein
490 AK001605 FLJ10743 hypothetical protein FLJ10743
491 AI041186 HSPC182 HSPC182 protein
492 AA873363 NM_144650 ADH8 alcohol dehydrogenase 8
493 NM_013409 FST follistatin
494 AK000322 FLJ20315 hypothetical protein FLJ20315
495 AB020637 XM_290546 KIAA0830 KIAA0830 protein
496 AA872040 INHBB inhibin, beta B (activin AB beta polypeptide)
497 NM_004430 EGR3 early growth response 3
498 D59989 ESTs
499 D78013 DPYSL2 dihydropyrimidinase-like 2
500 AI081821 Homo sapiens mRNA; cDNA DKFZp313M0417 (from clone DKFZp313M0417)
501 AA309603 KIAA1430 KIAA1430 protein
502 NM_004107 FCGRT Fc fragment of IgG, receptor, transporter, alpha
503 AW268719 Homo sapiens cDNA FLJ32438 fis, clone SKMUS2001402.
504 BF446578 NM_145313 LOC221002 CG4853 gene product
505 BG054844 NM_005168 ARHE ras homolog gene family, member E
506 AF054987 ALDOC aldolase C, fructose-bisphosphate
507 AI052390 FLJ20071 dymeclin
508 NM_004530 MMP2 matrix metalloproteinase 2 (gelatinase A, 72 kDa gelatinase, 72 kDa type IV collagenase)
509 AF054999 NM_001431 EPB41L2 erythrocyte membrane protein band 4.1-like 2
510 AU151591 NM_182964 NAV2 neuron navigator 2
511 AA447744 ESTs
512 R61253 ST6GalII beta-galactoside alpha-2,6-sialyltransferase II

TABLE 9 Primer sequences for semi-quantitative RT-PCR experiments
ACCESSION NO. Symbol Forward primer Reverse primer
AI261804 EST 5′-CTGTTCTGGCTTCGTTATGTTCT-3′ (SEQ ID NO: 1) 5′-AGAAAATACGGTCCTCTTGTTGC-3′ (SEQ ID NO: 2)
AA205444 AP1S2 5′-CACTGTAATGCACGACATTTGA-3′ (SEQ ID NO: 3) 5′-GTTACAGCTTAGCACAAGGCATC-3′ (SEQ ID NO: 4)
AA167194 LOC253782 5′-ACCTCTGAGTTTGATTTCCCAA-3′ (SEQ ID NO: 5) 5′-CGAGGCTTGTAACAATCTACTGG-3′ (SEQ ID NO: 6)
AA676987 EST 5′-GAAACTGTACGGGGGTTAAAGAG-3′ (SEQ ID NO: 7) 5′-CATCAATGTGGTGAGTGACATCT-3′ (SEQ ID NO: 8)
H22566 DACH 5′-AAGCCCTTGGAACAGAACATACT-3′ (SEQ ID NO: 9) 5′-CAGTAAACGTGGTTCTCACATTG-3′ (SEQ ID NO: 10)
NM_018492 TOPK 5′-AGACCCTAAAGATCGTCCTTCTG-3′ (SEQ ID NO: 13) 5′-GTGTTTTAAGTCAGCATGAGCAG-3′ (SEQ ID NO: 14)
NM_002046 GAPD 5′-CGACCACTTTGTCAAGCTCA-3′ (SEQ ID NO: 11) 5′-GGTTGAGCACAGGGTACTTTATT-3′ (SEQ ID NO: 12)

TABLE 10 List of genes with altered expression between well and poorly differentiated type in single case
BRC NO. ACCESSION NO. Symbol TITLE p-value
513 AV729269 XM_371074 DKFZP564D166 putative ankyrin-repeat containing protein 3.1E−07
514 AI246554 NM_014222 NDUFA8 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 8, 19 kDa 1.4E−06
515 J04080 C1S complement component 1, s subcomponent 1.4E−05
516 N93264 EST Homo sapiens, clone IMAGE: 4908933, mRNA 1.4E−05
517 NM_002318 LOXL2 lysyl oxidase-like 2 1.6E−05
518 J03464 COL1A2 collagen, type I, alpha 2 2.4E−05
519 U01184 NM_002018 FLII flightless I homolog (Drosophila) 2.5E−05
520 X63556 FBN1 fibrillin 1 (Marfan syndrome) 3.8E−05
521 X78137 PCBP1 poly(rC) binding protein 1 4.6E−05
522 AK021534 EST Homo sapiens cDNA FLJ11472 fis, clone HEMBA1001711. 6.3E−05
523 AK024012 NPD002 NPD002 protein 6.3E−05
524 AI200892 BIK BCL2-interacting killer (apoptosis-inducing) 9.1E−05
525 J03040 SPARC secreted protein, acidic, cysteine-rich (osteonectin) 9.3E−05
526 AW970143 C6orf49 chromosome 6 open reading frame 49 1.0E−04
527 D62873 EST Homo sapiens, clone IMAGE: 5288080, mRNA 1.2E−04
528 D42041 G2AN alpha glucosidase II alpha subunit 1.2E−04
529 AI376418 EST Homo sapiens cDNA FLJ35169 fis, clone PLACE6012908. 1.7E−04
530 AK026744 NM_024911 FLJ23091 hypothetical protein FLJ23091 1.8E−04
531 AF026292 CCT7 chaperonin containing TCP1, subunit 7 (eta) 2.0E−04
532 Y10805 HRMT1L2 HMT1 hnRNP methyltransferase-like 2 (S. cerevisiae) 2.1E−04
533 L12350 THBS2 thrombospondin 2 2.1E−04
534 AK025706 AMPD2 adenosine monophosphate deaminase 2 (isoform L) 2.4E−04
535 BE618804 PIG11 p53-induced protein 2.5E−04
536 AV713686 RPS29 ribosomal protein S29 2.8E−04
537 M26481 TACSTD1 tumor-associated calcium signal transducer 1 2.8E−04
538 D00099 ATP1A1 ATPase, Na+/K+ transporting, alpha 1 polypeptide 2.9E−04
539 AA946602 ORMDL2 ORM1-like 2 (S. cerevisiae) 2.9E−04
540 NM_001533 HNRPL heterogeneous nuclear ribonucleoprotein L 3.9E−04
541 BG107866 SIVA CD27-binding (Siva) protein 4.4E−04
542 W72297 NM_017866 FLJ20533 hypothetical protein FLJ20533 4.4E−04
543 U76992 HTATSF1 HIV TAT specific factor 1 4.8E−04
544 AA191454 NM_198897 FIBP fibroblast growth factor (acidic) intracellular binding protein 4.9E−04
545 BE903483 RPS20 ribosomal protein S20 5.4E−04
546 AJ005282 NPR2 natriuretic peptide receptor B/guanylate cyclase B (atrionatriuretic peptide receptor B) 5.5E−04
547 D86322 CLGN calmegin 5.7E−04
548 AA621665 EST EST 5.8E−04
549 M77349 TGFBI transforming growth factor, beta-induced, 68 kDa 6.3E−04
550 BE176466 ZAP3 ZAP3 protein 6.6E−04
551 AA776882 NM_030795 STMN4 stathmin-like 4 7.1E−04
552 AI261382 NM_016334 SH120 putative G-protein coupled receptor 7.1E−04
553 AB007618 COX7A2L cytochrome c oxidase subunit VIIa polypeptide 2 like 7.2E−04
554 D21261 TAGLN2 transgelin 2 7.5E−04
555 M68864 LOC51035 ORF 7.7E−04
556 AB007836 TGFB1I1 transforming growth factor beta 1 induced transcript 1 8.1E−04
557 AA173339 EST EST 8.4E−04
558 D87810 PMM1 phosphomannomutase 1 8.4E−04
559 M15798 NM_183356 ASNS asparagine synthetase 8.7E−04
560 AW072418 B7 B7 protein 9.0E−04
561 D38293 AP3M2 adaptor-related protein complex 3, mu 2 subunit 9.5E−04
562 NM_018950 HLA-F major histocompatibility complex, class I, F 1.0E−03
563 NM_001219 CALU calumenin 1.1E−03
564 J04162 FCGR3A Fc fragment of IgG, low affinity IIIa, receptor for (CD16) 1.1E−03
565 U09873 FSCN1 fascin homolog 1, actin-bundling protein (Strongylocentrotus purpuratus) 1.1E−03
566 N51082 NM_080759 DACH dachshund homolog (Drosophila) 1.3E−03
567 NM_004199 P4HA2 procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), alpha polypeptide II 1.3E−03
568 BE904196 GNB1 guanine nucleotide binding protein (G protein), beta polypeptide 1 1.3E−03
569 L08895 MEF2C MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C) 1.3E−03
570 AK022670 NM_016649 C20orf6 chromosome 20 open reading frame 6 1.3E−03
571 AW157725 POLR2F polymerase (RNA) II (DNA directed) polypeptide F 1.4E−03
572 NM_004939 DDX1 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 1 1.4E−03
573 X65463 NM_021976 RXRB retinoid X receptor, beta 1.5E−03
574 Z68179 LY6E lymphocyte antigen 6 complex, locus E 1.5E−03
575 BF976420 SNRPF small nuclear ribonucleoprotein polypeptide F 1.5E−03
576 D79986 BTF Bcl-2-associated transcription factor 1.5E−03
577 AK001023 NUBP2 nucleotide binding protein 2 (MinD homolog, E. coli) 1.6E−03
578 BE065329 EST EST 1.6E−03
579 L34600 MTIF2 mitochondrial translational initiation factor 2 1.7E−03
580 D13630 BZW1 basic leucine zipper and W2 domains 1 1.7E−03
581 X15880 NM_001848 COL6A1 collagen, type VI, alpha 1 1.7E−03
582 AB003723 PIGQ phosphatidylinositol glycan, class Q 1.7E−03
583 L36645 EPHA4 EphA4 1.7E−03
584 BF974358 RPS27 ribosomal protein S27 (metallopanstimulin 1) 1.8E−03
585 AA747449 HIP2 huntingtin interacting protein 2 1.9E−03
586 AA283813 FLJ12150 hypothetical protein FLJ12150 2.0E−03
587 L38995 NM_003321 TUFM Tu translation elongation factor, mitochondrial 2.0E−03
588 N67293 EST Homo sapiens cDNA FLJ11997 fis, clone HEMBB1001458. 2.1E−03
589 AB014549 KIAA0649 KIAA0649 gene product 2.1E−03
590 D38305 TOB1 transducer of ERBB2, 1 2.2E−03
591 L40391 NM_006827 TMP21 transmembrane trafficking protein 2.2E−03
592 H28960 EST ESTs 2.2E−03
593 U86753 CDC5L CDC5 cell division cycle 5-like (S. pombe) 2.3E−03
594 AI143226 BLP1 BBP-like protein 1 2.3E−03
595 M57730 EFNA1 ephrin-A1 2.3E−03
596 AI928868 UBR1 ubiquitin protein ligase E3 component n-recognin 1 2.3E−03
597 AF077044 RPAC2 likely ortholog of mouse RNA polymerase 1-3 (16 kDa subunit) 2.3E−03
598 AF097431 LEPRE1 leucine proline-enriched proteoglycan (leprecan) 1 2.4E−03
599 NM_004350 RUNX3 runt-related transcription factor 3 2.4E−03
600 AL162047 NCOA4 nuclear receptor coactivator 4 2.5E−03
601 BF915013 EST Homo sapiens cDNA FLJ37302 fis, clone BRAMY2016009. 2.5E−03
602 Z37166 BAT1 HLA-B associated transcript 1 2.5E−03
603 M81349 SAA4 serum amyloid A4, constitutive 2.6E−03
604 AL137338 NM_007214 SEC63L SEC63 protein 2.6E−03
605 AI745624 ELL2 ELL-related RNA polymerase II, elongation factor 2.6E−03
606 BG167522 HSPC016 hypothetical protein HSPC016 2.6E−03
607 U58766 TSTA3 tissue specific transplantation antigen P35B 2.7E−03
608 J04474 NM_000709 BCKDHA branched chain keto acid dehydrogenase E1, alpha polypeptide (maple syrup urine disease) 2.7E−03
609 H15977 NM_021116 EST Homo sapiens cDNA FLJ30781 fis, clone FEBRA2000874. 2.8E−03
610 AL049339 NM_001304 CPD carboxypeptidase D 2.8E−03
611 AL133555 NM_080821 C20orf108 chromosome 20 open reading frame 108 2.9E−03
612 AW662518 FLJ10876 hypothetical protein FLJ10876 2.9E−03
613 BE883507 NM_003663 CGGBP1 CGG triplet repeat binding protein 1 2.9E−03
614 BE797472 RPL17 ribosomal protein L17 3.0E−03
615 U41371 SF3B2 splicing factor 3b, subunit 2, 145 kDa 3.0E−03
616 L39068 DHPS deoxyhypusine synthase 3.1E−03
617 NM_004517 ILK integrin-linked kinase 3.1E−03
618 U14972 RPS10 ribosomal protein S10 3.2E−03
619 U61500 TMEM1 transmembrane protein 1 3.3E−03
620 NM_002719 PPP2R5C protein phosphatase 2, regulatory subunit B (B56), gamma isoform 3.3E−03
621 AF053233 VAMP8 vesicle-associated membrane protein 8 (endobrevin) 3.3E−03
622 NM_002822 NM_198974 PTK9 PTK9 protein tyrosine kinase 9 3.3E−03
623 U16996 DUSP5 dual specificity phosphatase 5 3.3E−03
624 AV705747 NM_006276 SFRS7 splicing factor, arginine/serine-rich 7, 35 kDa 3.3E−03
625 AF178984 IER5 immediate early response 5 3.3E−03
626 Z29093 DDR1 discoidin domain receptor family, member 1 3.3E−03
627 AB024536 ISLR immunoglobulin superfamily containing leucine-rich repeat 3.3E−03
628 BF791601 EMP2 epithelial membrane protein 2 3.3E−03
629 AF061737 SPC18 signal peptidase complex (18 kD) 3.3E−03
630 AB002386 EZH1 enhancer of zeste homolog 1 (Drosophila) 3.5E−03
631 AA634090 EST Homo sapiens, Similar to heterogeneous nuclear ribonucleoprotein A1, clone IMAGE: 2900557, mRNA 3.5E−03
632 AK023674 FLJ13612 likely ortholog of neuronally expressed calcium binding protein 3.6E−03
633 D13626 GPR105 G protein-coupled receptor 105 3.7E−03
634 AK026849 XM_371844 TSPYL TSPY-like 3.8E−03
635 Y18643 METTL1 methyltransferase-like 1 3.9E−03
636 AF176699 FBXL4 F-box and leucine-rich repeat protein 4 3.9E−03
637 NM_003977 AIP aryl hydrocarbon receptor interacting protein 3.9E−03
638 AK000498 HARS histidyl-tRNA synthetase 4.0E−03
639 U05237 NM_004459 FALZ fetal Alzheimer antigen 4.0E−03
640 BF696304 NM_032832 FLJ14735 hypothetical protein FLJ14735 4.0E−03
641 X14420 COL3A1 collagen, type III, alpha 1 (Ehlers-Danlos syndrome type IV, autosomal dominant) 4.1E−03
642 BE796098 NDUFS8 NADH dehydrogenase (ubiquinone) Fe-S protein 8, 23 kDa (NADH-coenzyme Q reductase) 4.3E−03
643 X60221 ATP5F1 ATP synthase, H+ transporting, mitochondrial F0 complex, subunit b, isoform 1 4.4E−03
644 AA135341 NM_021078 GCN5L2 GCN5 general control of amino-acid synthesis 5-like 2 (yeast) 4.6E−03
645 AF009368 CREB3 cAMP responsive element binding protein 3 (luman) 4.7E−03
646 BF970013 SPC12 signal peptidase 12 kDa 4.7E−03
647 W45522 ATPIF1 ATPase inhibitory factor 1 4.7E−03
648 AI733356 NM_006306 EST Homo sapiens cDNA FLJ31746 fis, clone NT2RI2007334. 4.8E−03
649 AW117927 EIF3S9 eukaryotic translation initiation factor 3, subunit 9 eta, 116 kDa 4.8E−03
650 AF275798 NM_012073 CCT5 chaperonin containing TCP1, subunit 5 (epsilon) 5.0E−03
651 AI937126 WTAP Wilms' tumour 1-associating protein 5.0E−03
652 AK024891 NM_203463 LOC253782 hypothetical protein LOC253782 5.1E−03
653 D13629 KTN1 kinectin 1 (kinesin receptor) 5.2E−03
654 AI682994 AHCYL1 S-adenosylhomocysteine hydrolase-like 1 5.3E−03
655 BF980325 NM_005742 ATP6V1C2 ATPase, H+ transporting, lysosomal 42 kDa, V1 subunit C isoform 2 5.3E−03
656 AI378996 NM_005381 NCL nucleolin 5.3E−03
657 D88153 HYA22 HYA22 protein 5.3E−03
658 S67310 BF B-factor, properdin 5.4E−03
659 AW438585 EST Homo sapiens, clone IMAGE: 5273745, mRNA 5.4E−03
660 M12267 OAT ornithine aminotransferase (gyrate atrophy) 5.5E−03
661 AB001636 DDX15 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 5.7E−03
662 D13315 GLO1 glyoxalase I 5.9E−03
663 AF244931 WDR10 WD repeat domain 10 5.9E−03
664 AL050094 IDH3B isocitrate dehydrogenase 3 (NAD+) beta 6.0E−03
665 AK022881 KIAA1272 KIAA1272 protein 6.0E−03
666 AI720096 RPL29 ribosomal protein L29 6.1E−03
667 Y12781 TBL1X transducin (beta)-like 1X-linked 6.2E−03
668 AI014538 NM_138384 LOC92170 hypothetical protein BC004409 6.2E−03
669 NM_020987 ANK3 ankyrin 3, node of Ranvier (ankyrin G) 6.3E−03
670 NM_004387 NKX2-5 NK2 transcription factor related, locus 5 (Drosophila) 6.3E−03
671 J03817 GSTM1 glutathione S-transferase M1 6.3E−03
672 BF435769 EST ESTs, Weakly similar to hypothetical protein FLJ20378 [Homo sapiens] [H. sapiens] 6.5E−03
673 AL390147 DKFZp547D065 hypothetical protein DKFZp547D065 6.5E−03
674 AA961412 NM_003333 UBA52 ubiquitin A-52 residue ribosomal protein fusion product 1 6.6E−03
675 NM_002702 POU6F1 POU domain, class 6, transcription factor 1 6.6E−03
676 M58050 MCP membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive antigen) 6.6E−03
677 NM_001293 CLNS1A chloride channel, nucleotide-sensitive, 1A 6.7E−03
678 BF213049 COX7A2 cytochrome c oxidase subunit VIIa polypeptide 2 (liver) 6.7E−03
679 AF236056 GOLPH2 golgi phosphoprotein 2 6.7E−03
680 U79285 NM_021079 NMT1 N-myristoyltransferase 1 6.8E−03
681 AB027196 RNF10 ring finger protein 10 6.9E−03
682 AA036952 FLJ30973 hypothetical protein FLJ30973 7.0E−03
683 AW732157 NM_052963 TOP1MT mitochondrial topoisomerase I 7.1E−03
684 AL049319 NM_032804 FLJ14547 hypothetical protein FLJ14547 7.3E−03
685 BE613161 EST Homo sapiens cDNA FLJ37042 fis, clone BRACE2011947. 7.3E−03
686 U28749 HMGA2 high mobility group AT-hook 2 7.3E−03
687 BF793677 MGC49942 hypothetical protein MGC49942 7.4E−03
688 BG032216 NM_017746 FLJ20287 hypothetical protein FLJ20287 7.4E−03
689 AL449244 PP2447 hypothetical protein PP2447 7.5E−03
690 AK024103 EST Homo sapiens cDNA FLJ14041 fis, clone HEMBA1005780. 7.5E−03
691 U17838 PRDM2 PR domain containing 2, with ZNF domain 7.5E−03
692 D86479 NM_001129 AEBP1 AE binding protein 1 7.5E−03
693 D50420 NHP2L1 NHP2 non-histone chromosome protein 2-like 1 (S. cerevisiae) 7.5E−03
694 D87258 PRSS11 protease, serine, 11 (IGF binding) 7.5E−03
695 BF434108 NM_014187 HSPC171 HSPC171 protein 7.6E−03
696 NM_000705 ATP4B ATPase, H+/K+ exchanging, beta polypeptide 7.7E−03
697 AF077599 SBB103 hypothetical SBBI03 protein 7.7E−03
698 NM_001530 HIF1A hypoxia-inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor) 7.8E−03
699 AB023204 EPB41L3 erythrocyte membrane protein band 4.1-like 3 7.8E−03
700 AA253194 NM_022121 PIGPC1 p53-induced protein PIGPC1 7.9E−03
701 BE502341 NM_139177 C17orf26 chromosome 17 open reading frame 26 7.9E−03
702 AL050265 TARDBP TAR DNA binding protein 8.0E−03
703 AK001643 NM_018215 FLJ10781 hypothetical protein FLJ10781 8.3E−03
704 BG179412 COX7B cytochrome c oxidase subunit VIIb 8.6E−03
705 X03212 KRT7 keratin 7 8.8E−03
706 L07033 HMGCL 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria) 9.0E−03
707 M19383 ANXA4 annexin A4 9.0E−03
708 NM_001273 CHD4 chromodomain helicase DNA binding protein 4 9.1E−03
709 NM_004461 FARSL phenylalanine-tRNA synthetase-like 9.1E−03
710 AI192880 CD44 CD44 antigen (homing function and Indian blood group system) 9.1E−03
711 AF038961 MPDU1 mannose-P-dolichol utilization defect 1 9.5E−03
712 U67322 C20orf18 chromosome 20 open reading frame 18 9.5E−03
713 AA521017 EST EST 9.5E−03
714 AA811043 NM_003730 RNASE6PL ribonuclease 6 precursor 9.9E−03
715 AA536113 TMEPAI transmembrane, prostate androgen induced RNA 9.9E−03
716 BF973104 LOC201725 hypothetical protein LOC201725 9.9E−03
717 NM_000293 PHKB phosphorylase kinase, beta 9.9E−03
718 NM_000548 TSC2 tuberous sclerosis 2 1.0E−02

TABLE 11 List of genes with altered expression between node-positive and node-negative tumors
BRC NO. ACCESSION NO. Symbol TITLE P-value + or −
719 BF686125 UBA52 ubiquitin A-52 residue ribosomal protein fusion product 1 8.1E−09 −
720 AA634090 Homo sapiens, Similar to heterogeneous nuclear ribonucleoprotein A1, clone IMAGE: 2900557, mRNA 1.4E−07 −
721 L00692 CEACAM3 carcinoembryonic antigen-related cell adhesion molecule 3 4.2E−07 −
722 AW954403 NM_004781 VAMP3 vesicle-associated membrane protein 3 (cellubrevin) 2.2E−06 +
723 AA865619 C21orf97 chromosome 21 open reading frame 97 2.6E−06 −
724 W74502 NM_032350 MGC11257 hypothetical protein MGC11257 2.4E−05 +
725 NM_002094 GSPT1 G1 to S phase transition 1 2.7E−05 +
726 T55178 KIAA1040 KIAA1040 protein 3.2E−05 −
727 L36983 DNM2 dynamin 2 4.1E−05 +
728 Z21507 EEF1D eukaryotic translation elongation factor 1 delta (guanine nucleotide exchange protein) 5.2E−05 −
729 AI581728 NM_005507 CFL1 cofilin 1 (non-muscle) 8.0E−05 +
730 NM_001293 CLNS1A chloride channel, nucleotide-sensitive, 1A 9.0E−05 +
731 BF680847 SENP2 sentrin-specific protease 9.0E−05 +
732 AF100743 NDUFS3 NADH dehydrogenase (ubiquinone) Fe-S protein 3, 30 kDa (NADH-coenzyme Q reductase) 9.8E−05 +
733 NM_004960 FUS fusion, derived from t(12; 16) malignant liposarcoma 9.8E−05 −
734 AK023975 NM_015934 NOP5/NOP58 nucleolar protein NOP5/NOP58 1.3E−04 +
735 AF083245 PSMD13 proteasome (prosome, macropain) 26S subunit, non-ATPase, 13 1.5E−04 +
736 AA129776 SUOX sulfite oxidase 1.8E−04 +
737 U55766 NM_007043 HRB2 HIV-1 rev binding protein 2 2.0E−04 +
738 BF526092 LOC154467 hypothetical protein BC003515 2.1E−04 +
739 BF677579 XM_370754 THTPA thiamine triphosphatase 2.3E−04 +
740 X98260 ZRF1 zuotin related factor 1 2.3E−04 +
741 BE440010 LOC51255 hypothetical protein LOC51255 2.7E−04 +
742 AF007165 NM_021008 DEAF1 deformed epidermal autoregulatory factor 1 (Drosophila) 2.7E−04 +
743 X78687 NEU1 sialidase 1 (lysosomal sialidase) 3.0E−04 +
744 AW965200 Homo sapiens, clone IMAGE: 5286019, mRNA 3.1E−04 −
745 AK023240 UGCGL1 UDP-glucose ceramide glucosyltransferase-like 1 3.1E−04 +
746 M95712 BRAF v-raf murine sarcoma viral oncogene homolog B1 3.7E−04 +
747 L38995 NM_003321 TUFM Tu translation elongation factor, mitochondrial 3.9E−04 +
748 AW014268 FLJ10726 hypothetical protein FLJ10726 4.2E−04 +
749 D49547 DNAJB1 DnaJ (Hsp40) homolog, subfamily B, member 1 4.4E−04 +
750 BE466450 AP4S1 adaptor-related protein complex 4, sigma 1 subunit 4.5E−04 +
751 AB007944 KIAA0475 KIAA0475 gene product 4.9E−04 −
752 AF034091 MRPL40 mitochondrial ribosomal protein L40 5.1E−04 +

TABLE 12 Histoclinical information
Columns: ID; age in operation; menopause status; T; N; M; Stage; Histological type; Lymphocytic infiltrate; Angioinvasion; ER; PgR
MMK010003 51 pre 2 1 0 2 a3 3 0 + +
MMK010004 47 pre 2 1 0 2 a1 0 0 + +
MMK010005 44 pre 2 0 0 2 a1 1 0 + +
MMK010013 45 pre 2 1 0 2 a1 1 0 − −
MMK010016 44 pre 2 0 0 2 a2 0 0 − −
MMK010025 46 pre 2 0 0 2 a1 0 0 + +
MMK010031 29 pre 2 2 0 3 a3 3 0 − −
MMK010037 62 post 0 0 0 0 Ia 0 0 + +
MMK010042 47 pre 2 1 0 2 a3 1 2 + +
MMK010086 42 pre 2 0 0 2 a1 0 0 + +
MMK010102 51 pre 2 1 0 3 a2 3 0 + +
MMK010110 39 pre 2 0 0 2 a1 2 0 − −
MMK010129 52 pre 2 2 0 3 a1 2 0 − −
MMK010135 41 pre 2 0 0 2 a1 0 0 + +
MMK010138 38 pre 2 0 0 2 a1 0 0 + +
MMK010145 51 pre 2 1 0 2 a3 0 0 + +
MMK010147 49 pre 2 1 0 2 a1 1 0 + +
MMK010149 35 pre 2 0 0 2 a3 1 0 − −
MMK010175 38 pre 2 0 0 2 a3 0 0 + +
MMK010178 51 pre 0 0 0 0 Ia 0 0 + +
MMK010207 40 pre 2 0 0 2 a1 0 0 + +
MMK010214 42 pre 2 1 0 2 a1 0 0 − −
MMK010247 48 pre 2 1 0 2 a2 3 0 − −
MMK010252 52 pre 2 1 0 2 a2 0 0 − −
MMK010255 47 pre 2 0 0 2 a2 0 0 − −
MMK010302 46 pre 2 1 0 2 a2 2 1 − −
MMK010304 48 pre 2 1 0 2 a3 1 0 + +
MMK010326 53 post 0 0 0 0 Ia 0 0 − −
MMK010327 43 pre 2 1 0 2 a1 1 1 + +
MMK010341 42 pre 2 1 0 2 a1 2 0 + +
MMK010370 46 pre 2 1 0 2 a3 2 0 + +
MMK010397 38 pre 2 1 0 2 a3 3 2 + +
MMK010411 46 pre 2 0 0 2 a1 0 0 + +
MMK010431 50 pre 2 0 0 2 a3 0 0 − −
MMK010435 49 pre 2 1 0 2 a3 0 0 + +
MMK010453 49 pre 2 1 0 2 a3 3 0 + +
MMK010471 42 pre 2 1 0 2 a1 3 0 − −
MMK010473 40 pre 2 1 0 2 a2 0 0 − −
MMK010478 38 pre 2 2 0 3 a2 0 0 + +
MMK010491 46 pre 2 0 0 2 a3 1 0 + +
MMK010497 44 pre 0 0 0 0 Ia 0 0 − +
MMK010500 45 pre 2 0 0 2 a1 0 0 + +
MMK010502 51 pre 2 0 0 2 a2 0 0 − −
MMK010508 51 pre 2 1 0 2 a2 0 0 − −
MMK010521 21 pre 2 0 0 2 a1 1 1 − −
MMK010552 49 pre 2 0 0 2 a2 0 0 − −
MMK010554 51 pre 2 0 0 2 a3 2 0 + +
MMK010571 45 pre 2 1 1 4 a3 3 0 + +
MMK010591 40 pre 0 0 0 0 Ia 0 0 − +
MMK010613 37 pre 0 0 0 0 Ia 0 0 − +
MMK010623 39 pre 2 1 0 2 a1 3 0 + +
MMK010624 39 pre 2 1 0 2 a1 3 0 + +
MMK010626 48 pre 2 0 0 2 a1 1 1 − −
MMK010631 41 pre 2 0 0 2 a1 0 0 + +
MMK010640 35 pre 0 0 0 0 Ia 0 0 + +
MMK010644 47 pre 2 2 0 2 a3 3 0 + +
MMK010646 37 pre 2 1 0 2 a3 1 0 + +
MMK010660 46 pre 2 0 0 2 a1 0 0 − −
MMK010671 45 pre 2 0 0 2 a1 0 0 − −
MMK010679 68 post 0 0 0 0 Ia 0 0 + +
MMK010680 58 post 0 0 0 0 Ia 0 0 − +
MMK010709 33 pre 2 0 0 2 a3 0 2 − −
MMK010711 51 pre 0 0 0 0 Ia 0 0 − +
MMK010724 40 pre 2 1 0 2 a3 3 2 + +
MMK010744 41 pre 0 0 0 0 Ia 0 0 + +
MMK010758 40 pre 2 1 0 2 a1 0 1 + +
MMK010760 42 pre 2 0 0 2 a1 0 0 + +
MMK010762 50 pre 2 1 0 2 a3 3 1 + +
MMK010769 33 pre 2 0 0 2 a2 0 0 − −
MMK010772 45 pre 2 1 0 2 a3 2 0 − −
MMK010779 46 pre 2 1 0 2 a2 0 1 − −
MMK010780 31 pre 2 0 0 2 a2 0 0 − −
MMK010781 44 pre 2 0 0 2 a3 0 2 + +
MMK010794 52 pre 2 1 0 2 a3 2 1 + +
MMK010818 51 pre 2 0 0 2 a1 0 2 + +
MMK010835 42 pre 0 0 0 0 Ia 0 0 + +
MMK010846 47 pre 2 0 0 2 a1 0 0 + +
MMK010858 42 pre 2 1 0 2 a3 2 3 + +
MMK010864 52 pre 2 1 0 2 a1 0 1 − −
MMK010869 45 pre 2 0 0 2 a1 0 1 − −
MMK010903 47 pre 2 0 0 2 a1 0 0 + +

TABLE 13
Si1-F 5′-CACCGAACGATATAAAGCCAGCCTTCAAGAGAGGCTGGCTTTATATCGTTC-3′ SEQ ID NO. 23
Si1-R 5′-AAAAGAACGATATAAAGCCAGCCTCTCTTGAAGGCTGGCTTTATATCGTTC-3′ SEQ ID NO. 24
Si1-Target 5′-GAACGATATAAAGCCAGCC-3′ SEQ ID NO. 25
Si3-F 5′-CACCCTGGATGAATCATACCAGATTCAAGAGATCTGGTATGATTCATCCAG-3′ SEQ ID NO. 26
Si3-R 5′-AAAACTGGATGAATCATACCAGATCTCTTGAATCTGGTATGATTCATCCAG-3′ SEQ ID NO. 27
Si3-Target 5′-CTGGATGAATCATACCAGA-3′ SEQ ID NO. 28
Si4-F 5′-CACCGTGTGGCTTGCGTAAATAATTCAAGAGATTATTTACGCAAGCCACAC-3′ SEQ ID NO. 29
Si4-R 5′-AAAAGTGTGGCTTGCGTAAATAATCTCTTGAATTATTTACGCAAGCCACAC-3′ SEQ ID NO. 30
Si4-Target 5′-GTGTGGCTTGCGTAAATAA-3′ SEQ ID NO. 31
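
The oligonucleotides listed in TABLE 13 appear to share one layout: a 4-nt overhang (CACC on the forward strand, AAAA on the reverse strand), the 19-nt target, a 9-nt loop, and the reverse complement of the target. The following minimal sketch rebuilds each pair from its target under that assumption. The layout is inferred from the listed sequences only, and the names `hairpin_oligos`, `reverse_complement`, and `LOOP` are hypothetical helpers introduced for illustration.

```python
# Illustrative sketch: the overhangs and loop below are read off the sequences
# in TABLE 13; they are an inferred pattern, not a stated design rule.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOOP = "TTCAAGAGA"  # 9-nt loop seen in the Si1-F, Si3-F and Si4-F entries


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_oligos(target: str) -> tuple[str, str]:
    """Build the forward and reverse oligos for a 19-nt target sequence."""
    insert = target + LOOP + reverse_complement(target)
    forward = "CACC" + insert
    # The reverse strand carries the reverse complement of the insert,
    # which places the complemented loop (TCTCTTGAA) in the middle.
    reverse = "AAAA" + reverse_complement(insert)
    return forward, reverse


if __name__ == "__main__":
    targets = {
        "Si1": "GAACGATATAAAGCCAGCC",
        "Si3": "CTGGATGAATCATACCAGA",
        "Si4": "GTGTGGCTTGCGTAAATAA",
    }
    for name, target in targets.items():
        fwd, rev = hairpin_oligos(target)
        print(f"{name}-F 5'-{fwd}-3'")
        print(f"{name}-R 5'-{rev}-3'")
```

Comparing the printed strings with the Si-F and Si-R entries of TABLE 13 is a quick way to check a transcribed sequence for copying errors.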

INDUSTRIAL APPLICABILITY

The gene-expression analysis of breast cancer described herein, obtained through a combination of laser-capture dissection and genome-wide cDNA microarray, has identified specific genes as targets for cancer prevention and therapy. Based on the expression of a subset of these differentially expressed genes, the present invention provides molecular diagnostic markers for identifying and detecting breast cancer.

The methods described herein are also useful in the identification of additional molecular targets for prevention, diagnosis and treatment of breast cancer. The data reported herein add to a comprehensive understanding of breast cancer, facilitate development of novel diagnostic strategies, and provide clues for identification of molecular targets for therapeutic drugs and preventative agents. Such information contributes to a more profound understanding of breast tumorigenesis, and provides indicators for developing novel strategies for diagnosis, treatment, and ultimately prevention of breast cancer.

All patents, patent applications, and publications cited herein are incorporated by reference in their entirety.

Furthermore, while the invention has been described in detail and with reference to specific embodiments thereof, it is to be understood that the foregoing description is exemplary and explanatory in nature and is intended to illustrate the invention and its preferred embodiments. Through routine experimentation, one skilled in the art will readily recognize that various changes and modifications can be made therein without departing from the spirit and scope of the invention. Thus, the invention is intended to be defined not by the above description, but by the following claims and their equivalents.

1. A method of diagnosing breast cancer or a predisposition for developing breast cancer in a subject, comprising determining a level of expression of BRC No. 456 in a patient-derived biological sample, wherein an increase or decrease in said sample expression level as compared to a normal control level of said gene indicates that said subject suffers from or is at risk of developing breast cancer.
2. (canceled)
3. The method of claim 1, wherein said sample expression level is at least 10% greater than said normal control level.
4-5. (canceled)
6. The method of claim 1, wherein said breast cancer is IDC.
7-10. (canceled)
11. The method of claim 1, wherein said method further comprises determining the level of expression of a plurality of breast cancer-associated genes.
12. The method of claim 1, wherein gene expression level is determined by a method selected from the group consisting of: (a) detecting mRNA of BRC No. 456, (b) detecting a protein encoded by BRC No. 456, and (c) detecting a biological activity of a protein encoded by BRC No. 456.
13. The method of claim 12, wherein said detection is carried out on a DNA array.
14. The method of claim 1, wherein said patient-derived biological sample comprises an epithelial cell.
15. The method of claim 1, wherein said patient-derived biological sample comprises a breast cell.
16. The method of claim 1, wherein said patient-derived biological sample comprises an epithelial cell from a breast tissue.
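
By way of illustration only, the numerical comparison recited in claim 3 (a sample expression level at least 10% greater than the normal control level) could be sketched as follows. The function names, the use of a single scalar expression value per sample, and the example values are assumptions made for this sketch; the claims do not prescribe any particular normalization or software implementation.

```python
# Hypothetical sketch of the threshold comparison in claim 3.
# Expression levels are assumed to be positive, already-normalized scalars.

def relative_change(sample_level: float, control_level: float) -> float:
    """Fractional change of the sample level relative to the control level."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return (sample_level - control_level) / control_level


def is_elevated(sample_level: float, control_level: float,
                min_increase: float = 0.10) -> bool:
    """True if the sample level exceeds the control by at least min_increase."""
    return relative_change(sample_level, control_level) >= min_increase


if __name__ == "__main__":
    # Example values are invented for illustration, not measured data.
    print(is_elevated(150.0, 100.0))  # 50% above control
    print(is_elevated(105.0, 100.0))  # 5% above control
```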